Users should have a way to specify additional Spark properties alongside those recommended by the AutoTuner/Bootstrapper.
The goal is to make the final tuning file ready for direct use by consolidating properties from the AutoTuner/Bootstrapper with the user-provided spark properties.
For example, the AutoTuner/Bootstrapper does not recommend certain Spark properties because of potential risks (data integrity, etc.). However, users should have the option to enable them if needed:
The AutoTuner should not recommend these values. There should be a way for the user to say it is OK to run in the "incompat, I don't care, rampant" mode, which can be supplied as a config file beyond what the AutoTuner recommends. If we give these configs as expected, we break data integrity in some cases, quite silently. The user should opt in to this.
Based on offline discussions, we should allow users to specify custom spark_properties in addition to those recommended by the AutoTuner.
The goal is to make the final tuning file ready for direct use, containing a combined set of properties from the AutoTuner/Bootstrapper along with user-defined configurations.
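The consolidation described above could be sketched roughly as follows. This is a minimal illustration and not the tool's implementation: the function names `merge_spark_properties` and `to_conf_lines` are hypothetical, the property values are made up, and the `--conf key=value` output format is an assumption about what a ready-to-use tuning file would contain. It also assumes that a user-provided value wins when the same key appears on both sides, since the user opted in explicitly.

```python
from typing import Dict, List


def merge_spark_properties(recommended: Dict[str, str],
                           user_provided: Dict[str, str]) -> Dict[str, str]:
    """Combine tuner recommendations with user-provided properties.

    Assumption: on a key conflict the user-provided value takes precedence.
    """
    merged = dict(recommended)      # copy so the inputs are left untouched
    merged.update(user_provided)    # user entries override recommendations
    return merged


def to_conf_lines(properties: Dict[str, str]) -> List[str]:
    """Render properties as `--conf key=value` lines, sorted by key."""
    return [f"--conf {key}={value}" for key, value in sorted(properties.items())]


# Illustrative values only; they are not real AutoTuner output.
recommended = {
    "spark.executor.cores": "16",
    "spark.rapids.sql.concurrentGpuTasks": "2",
}
user_provided = {
    "spark.rapids.sql.incompatibleOps.enabled": "true",  # explicit opt-in
    "spark.executor.cores": "8",                         # overrides the tuner
}

combined = merge_spark_properties(recommended, user_provided)
for line in to_conf_lines(combined):
    print(line)
```

Keeping the merge separate from the rendering means the precedence rule can be changed (for example, to warn on conflicts instead of overriding) without touching the output format.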
Updated the description to reflect this.
parthosa changed the title from "[BUG] AutoTuner/Bootstrapper: Enable configurations for Columnar2Row" to "[BUG] AutoTuner/Bootstrapper: Support user-provided spark properties" on Feb 5, 2025
parthosa changed the title from "[BUG] AutoTuner/Bootstrapper: Support user-provided spark properties" to "[FEA] AutoTuner/Bootstrapper: Support user-provided spark properties" on Feb 5, 2025
Isn't this the typical usage of the tuner output? Users can append to it whatever they find necessary including non-rapids properties.
Users still have to wrap their own configs anyway to suit their CSP environment, submission scripts, etc.
Profiler's context: it makes sense that the autotuner should process all rapids configs and add comments or make recommendations.
Qual's context: the Spark properties are pulled from eventlog/clusterInfo, so I am not sure where the rapids* properties would come from, or how they are supposed to be passed to the rapids_spark Python CLI.
Possible Ideas:
--tools_config_file
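One way this idea might look is sketched below. It assumes, purely for illustration, that the file passed via `--tools_config_file` has already been loaded into a mapping and that it carries a hypothetical `spark_properties` list of `name`/`value` entries. Neither that layout nor the helper `extract_user_properties` is confirmed by this thread.

```python
from typing import Any, Dict


def extract_user_properties(config: Dict[str, Any]) -> Dict[str, str]:
    """Pull user-supplied Spark properties out of a loaded config mapping.

    Assumption: entries live under a hypothetical `spark_properties` key,
    each with a `name` and a `value`.
    """
    properties: Dict[str, str] = {}
    for entry in config.get("spark_properties") or []:
        name = str(entry["name"]).strip()
        if not name.startswith("spark."):
            raise ValueError(f"not a Spark property: {name!r}")
        value = entry["value"]
        # Spark expects lowercase booleans, so normalize True/False.
        if isinstance(value, bool):
            properties[name] = str(value).lower()
        else:
            properties[name] = str(value)
    return properties


# Hypothetical config content, as it might look after loading the file.
loaded_config = {
    "spark_properties": [
        {"name": "spark.rapids.sql.incompatibleOps.enabled", "value": True},
        {"name": "spark.executor.memory", "value": "16g"},
    ]
}

user_properties = extract_user_properties(loaded_config)
print(user_properties)
```

Because the properties come only from a file the user wrote, the opt-in stays explicit: nothing risky is enabled unless it is listed there.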
cc: @viadea @kuhushukla