
YAHPO Gym always requires full configuration, also in case of forbidden hyperparameters #94

Open
LukasFehring opened this issue Mar 5, 2025 · 5 comments

Comments


LukasFehring commented Mar 5, 2025

I observed this behavior, for example, for "rbv2_ranger" on instance '470'.

The following example was created in a fresh environment with yahpo_gym. check=False is required because ConfigSpace 0.6.1 does not contain the needed check_valid_configuration method.

from yahpo_gym import benchmark_set

benchmark = benchmark_set.BenchmarkSet(scenario="rbv2_ranger", check=False)
benchmark.set_instance(value="470")

config = {
    "min.node.size": 50,
    "mtry.power": 0.0,
    "num.impute.selected.cpo": "impute.mean",
    "num.trees": 1000,
    "respect.unordered.factors": "ignore",
    "sample.fraction": 0.55,
    "splitrule": "gini",
    "task_id": "470",
}

print(benchmark.objective_function(config))

sumny commented Mar 5, 2025

Thanks for opening this issue.
Can you maybe elaborate a bit on what exactly the issue is and what the expected behavior should be?
Let me elaborate:

benchmark.get_opt_space()
Configuration space object:
  Hyperparameters:
    min.node.size, Type: UniformInteger, Range: [1, 100], Default: 50
    mtry.power, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.0
    num.impute.selected.cpo, Type: Categorical, Choices: {impute.mean, impute.median, impute.hist}, Default: impute.mean
    num.random.splits, Type: UniformInteger, Range: [1, 100], Default: 1
    num.trees, Type: UniformInteger, Range: [1, 2000], Default: 1000
    repl, Type: Categorical, Choices: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, Default: 10
    replace, Type: Categorical, Choices: {TRUE, FALSE}, Default: TRUE
    respect.unordered.factors, Type: Categorical, Choices: {ignore, order, partition}, Default: ignore
    sample.fraction, Type: UniformFloat, Range: [0.1, 1.0], Default: 0.55
    splitrule, Type: Categorical, Choices: {gini, extratrees}, Default: gini
    task_id, Type: Constant, Value: 470
    trainsize, Type: UniformFloat, Range: [0.03, 1.0], Default: 0.525
  Conditions:
    num.random.splits | splitrule == 'extratrees'

This tells you what a configuration should look like based on the optimization space (which sets the instance value to a constant and potentially drops fidelity parameters, fixing them at their highest value); i.e., benchmark.config_space is not necessarily the search space that is optimized over but contains more parameters than the actual optimization space, which you get via get_opt_space().
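
For illustration, one way to inspect that difference (a rough sketch assuming ConfigSpace's get_hyperparameter_names(); newer ConfigSpace versions may expose the names differently):

full_space_params = set(benchmark.config_space.get_hyperparameter_names())
opt_space_params = set(benchmark.get_opt_space().get_hyperparameter_names())
# parameters that are in the full configuration space but not in the optimization space
print(full_space_params - opt_space_params)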

If we sample a point from the optimization space, we can also see what the benchmark expects to be part of a configuration:

benchmark.get_opt_space().sample_configuration(1)
Configuration(values={
  'min.node.size': 71,
  'mtry.power': 0.9155386229529013,
  'num.impute.selected.cpo': 'impute.hist',
  'num.random.splits': 34,
  'num.trees': 1485,
  'repl': '1',
  'replace': 'TRUE',
  'respect.unordered.factors': 'partition',
  'sample.fraction': 0.16011881922566848,
  'splitrule': 'extratrees',
  'task_id': '470',
  'trainsize': 0.5538487728906515,
})

I.e., a configuration must always contain all parameters that are active.
This is because the surrogate model is trained over all instances and points for a given scenario and can handle missing value imputation. Therefore, if you disable checks, the surrogate will still return a prediction, even for an incomplete configuration (it can handle missing values), but the output will likely not be sensible.

If you keep check = True

benchmark = benchmark_set.BenchmarkSet(scenario="rbv2_ranger", check=True)

config = {
    "min.node.size": 50,
    "mtry.power": 0.0,
    "num.impute.selected.cpo": "impute.mean",
    "num.trees": 1000,
    "respect.unordered.factors": "ignore",
    "sample.fraction": 0.55,
    "splitrule": "gini",
    "task_id": "470",
}

print(benchmark.objective_function(config))

you will actually be told that your point is not fully specified:

ValueError: Active hyperparameter 'repl' not specified!

So in general, unless you always specify points fully, setting check = False can be misleading, because the surrogate still tries to predict values for a configuration that it has never seen and that does not actually exist in this space (the space requires full specification of all active parameters that are part of the optimization space).

What I am not sure about is the "check=False is required because ConfigSpace 0.6.1 does not contain the needed check_valid_configuration method" part of your question.
Can you provide more details here? As far as I recall, check = True will perform an internal check of the configuration that you provide prior to evaluating it with the surrogate model; this is done with the check_configuration method of the config_space object itself, which is of class ConfigSpace.configuration_space.ConfigurationSpace.
Admittedly, YAHPO Gym is still in need of some overhaul to work with newer versions of ConfigSpace (which hopefully will eventually happen with a v2), but the check itself should work.
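
If it helps, here is a minimal sketch of performing such a check by hand with ConfigSpace 0.6.x (an illustration only; not necessarily the exact code path YAHPO Gym uses internally):

from ConfigSpace import Configuration
from yahpo_gym import benchmark_set

bench = benchmark_set.BenchmarkSet(scenario="rbv2_ranger", check=True)
bench.set_instance(value="470")
opt_space = bench.get_opt_space()

valid = opt_space.sample_configuration().get_dictionary()
cfg = Configuration(opt_space, values=valid)  # building a Configuration already validates it against the space
opt_space.check_configuration(cfg)            # explicit check; passes for a valid point

# With an incomplete dict (e.g., 'repl' missing), the same construction raises
# something like: ValueError: Active hyperparameter 'repl' not specified!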


LukasFehring commented Mar 6, 2025

Hi, thank you for your swift answer :)

Yes, the issue appears to be caused by us using the new ConfigSpace. Because of the old ConfigSpace dependency, SMAC and other libraries cannot be used with yahpo_gym by default. For that reason, we use an updated fork created by @benjamc and patch local ConfigSpace references. This was done for CARP-S.

git clone https://github.com/benjamc/yahpo_gym.git lib/yahpo_gym
$CONDA_RUN_COMMAND $PIP install -e lib/yahpo_gym/yahpo_gym
cd $CARPS_ROOT/carps
mkdir benchmark_data
cd benchmark_data
git clone https://github.com/slds-lmu/yahpo_data.git
cd ../..
$CONDA_RUN_COMMAND python $CARPS_ROOT/scripts/patch_yahpo_configspace.py
$CONDA_RUN_COMMAND $PIP install ConfigSpace --upgrade

In order to still be able to use the library, we start from a default configuration and replace all optimized parameters, as indicated below. Would you suggest setting those parameters differently? We would assume that the surrogate would be trained with the defaults?

def _train(self, config: Configuration, seed: int = 0):
    # Start with default config and replace values. Otherwise YahpoGym fails
    final_config = self.benchmark._get_config_space().get_default_configuration()
    for name, value in config.items():
        final_config[name] = value

    res = self.benchmark.objective_function(configuration=final_config)


sumny commented Mar 10, 2025

This code snippet might work (apparently, specifying a parent parameter in a ConfigSpace configuration will correctly drop child parameters that do not match the parent), but depending on the logic you use to generate a config above, I would expect this might also fail, and there is no guarantee from our side that the surrogate does what it is supposed to do, especially if checks are disabled.
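
To make the point about children being dropped concrete, here is a small sketch using ConfigSpace's deactivate_inactive_hyperparameters utility (this is ConfigSpace functionality, not YAHPO Gym, and just one way to see the effect):

from ConfigSpace.util import deactivate_inactive_hyperparameters
from yahpo_gym import benchmark_set

bench = benchmark_set.BenchmarkSet(scenario="rbv2_ranger", check=True)
bench.set_instance(value="470")
opt_space = bench.get_opt_space()

cfg = opt_space.sample_configuration()
# num.random.splits is only active when splitrule == 'extratrees',
# so ConfigSpace leaves it out of the sampled configuration otherwise
if cfg["splitrule"] == "gini":
    print("num.random.splits" in cfg.get_dictionary())  # expected: False

# explicitly drop children whose parent condition is not fulfilled
cleaned = deactivate_inactive_hyperparameters(cfg, opt_space)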

Note that if you disable checks, you will be able to predict on any, even meaningless, configuration:

benchmark = benchmark_set.BenchmarkSet(scenario="rbv2_super", check=False)
config = {
  "learner_id": "rpart",
  "num.impute.selected.cpo": "impute.median",
  "repl": 6,
  "rpart.cp": 0.73,
  "rpart.maxdepth": 22,
  "rpart.minbucket": 65,
  "rpart.minsplit": 48,
  "task_id": "24",
  "trainsize": 1.0,
  "aknn.M": 34,
  "aknn.distance":  "l2"
}

The config above is a correctly and fully specified rpart decision tree configuration, but it also contains some knn hyperparameters. With checks disabled, the surrogate will not care:

benchmark.objective_function(config)
[{'acc': 0.9515822,
  'bac': 0.6554274,
  'auc': 0.8728429,
  'brier': 0.06943077,
  'f1': 0.47613993,
  'logloss': 0.2400041,
  'timetrain': 53.49286,
  'timepredict': 0.24357893,
  'memory': 223.24962}]

And you will get a prediction, although the configuration itself does not make any sense.
Therefore, if you disable checks, it is your full responsibility to guarantee that only sensible configurations can and will be generated and evaluated.

Some background on the surrogates:
"We would assume that the surrogate would be trained with the defaults?"
I do not fully understand this sentence, sorry.
The configuration space describes the search space of the original experiments used for data collection (which we are then analogously using to perform HPO on the surrogate trained on this data).
From these configuration spaces, many hyperparameter configurations were generated at random (with some exceptions for the repl and trainsize parameters), and the surrogate was trained on the collected data.
To handle hierarchical search spaces and dependencies, the surrogate supports missing value imputation (e.g., so that during training and prediction the hyperparameters of the knn can be set to NA when looking at a configuration of the rpart decision tree, etc.).
This NA handling mechanism is also the reason why the surrogate, if checks are disabled, will be able to predict for any point that you generate, regardless of whether it is meaningful or not.
Defaults do not carry any special meaning; they are simply the values ConfigSpace chose to assign as defaults when the configuration spaces were generated.

LukasFehring commented

"We would assume that the surrogate would be trained with the defaults", was mainly meant as a code description. Nevertheless, it would be an interpolation mechanism. How would you propose imputing these values?


sumny commented Mar 10, 2025

You as a user are not required to, and must not, impute the missing values.
The surrogate model will do this itself.
I.e., for the point:

config = {
  "learner_id": "rpart",
  "num.impute.selected.cpo": "impute.median",
  "repl": 6,
  "rpart.cp": 0.73,
  "rpart.maxdepth": 22,
  "rpart.minbucket": 65,
  "rpart.minsplit": 48,
  "task_id": "24",
  "trainsize": 1.0
}

all parameters of all other learners are missing when looking at the rbv2_super search space.
Therefore, the surrogate will correctly perform missing value imputation (e.g., a new factor level for missing categoricals, out-of-range imputation or something similar for numericals) for all these parameters on its own.
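
Roughly, this kind of preprocessing amounts to something like the following sketch (generic pandas code purely for illustration; the actual surrogate preprocessing may differ in detail):

import pandas as pd

# one rpart configuration from rbv2_super; the aknn.* columns are missing by construction
row = pd.DataFrame([{"learner_id": "rpart", "rpart.cp": 0.73,
                     "aknn.M": None, "aknn.distance": None}])

# missing categoricals -> new factor level, missing numericals -> out-of-range value
row["aknn.distance"] = row["aknn.distance"].fillna("_missing_")
row["aknn.M"] = row["aknn.M"].fillna(-1)
print(row)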

Staying with the example above, if you were to pass a weird config such as:

config = {
  "learner_id": "rpart",
  "num.impute.selected.cpo": "impute.median",
  "repl": 6,
  "rpart.cp": 0.73,
  "rpart.maxdepth": 22,
  "rpart.minbucket": 65,
  "rpart.minsplit": 48,
  "task_id": "24",
  "trainsize": 1.0,
  "aknn.M": 34,
  "aknn.distance":  "l2"
}

the issue is that aknn.M and aknn.distance are in fact not missing, so no missing value imputation will be performed by the surrogate, which results in a weird prediction (admittedly for a point that is not a valid configuration anyway).

So to summarize: as a user, you should never perform missing value imputation yourself when calling the objective function (i.e., the surrogate), and you should not pass weird configurations like the one I created above when checks are disabled.

Coming back to your original question from the start: the only things you do need to provide on top of a valid configuration are hyperparameters such as task_id, repl and trainsize (i.e., the instance identifier of a benchmark from a benchmark scenario, as well as the fidelity parameters). This may seem somewhat counterintuitive at first, but it makes sense when you consider what kind of data the surrogate was trained on: it was trained across different instance ids and fidelity parameters.

Therefore, like I already mentioned above, we usually do the following in code examples:

from yahpo_gym import benchmark_set

benchmark = benchmark_set.BenchmarkSet(scenario="rbv2_ranger", check=True)
benchmark.set_instance(value="470")  # fix a given instance id

search_space_with_fidelity = benchmark.get_opt_space()
search_space_without_fidelity = benchmark.get_opt_space(drop_fidelity_params=True)

Note that the only difference between both search spaces is that in search_space_without_fidelity, trainsize and repl have been removed. Both still include task_id as a constant (set to the value "470" as specified by us).
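
You can verify this quickly (a sketch, again assuming ConfigSpace's get_hyperparameter_names()):

with_fid = set(search_space_with_fidelity.get_hyperparameter_names())
without_fid = set(search_space_without_fidelity.get_hyperparameter_names())
print(with_fid - without_fid)  # expected: {'trainsize', 'repl'}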

config_with_fidelity = search_space_with_fidelity.sample_configuration()
config_without_fidelity = search_space_without_fidelity.sample_configuration()

The point config_with_fidelity is fully specified and valid and we can evaluate it directly:

benchmark.objective_function(config_with_fidelity)

The other point config_without_fidelity, however, will need values for trainsize and repl to be added and the checks will tell you this:

benchmark.objective_function(config_without_fidelity)
ValueError: Active hyperparameter 'repl' not specified!
config_without_fidelity_readded = config_without_fidelity.get_dictionary()
config_without_fidelity_readded["repl"] = config_with_fidelity["repl"]
config_without_fidelity_readded["trainsize"] = config_with_fidelity["trainsize"]
benchmark.objective_function(config_without_fidelity_readded)
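
As a side note, on newer ConfigSpace versions where get_dictionary() is deprecated, plain dict conversion should achieve the same (a sketch, assuming Configuration behaves as a mapping there):

config_without_fidelity_readded = dict(config_without_fidelity)
config_without_fidelity_readded["repl"] = config_with_fidelity["repl"]
config_without_fidelity_readded["trainsize"] = config_with_fidelity["trainsize"]
benchmark.objective_function(config_without_fidelity_readded)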

Maybe we will increase usability here in an eventual v2.0 version.
For some more examples, you might also find https://github.com/slds-lmu/yahpo_gym/blob/main/yahpo_gym/notebooks/tuning_hpandster_on_yahpo.ipynb helpful.
