Controlling Hyper-parameter Tuning Using Expert TOML Configuration Settings
The goal of this document is to show how to use the expert TOML configuration settings to control model hyper-parameter tuning.
In Driverless AI 1.10.2, model hyper-parameter tuning can be controlled using either Custom Individual recipes or the expert TOML configuration settings. The recommended approach is to use the TOML settings unless more detailed code control is needed, for example, to force certain features on or off. There are some caveats to using the TOML configuration settings, outlined here, that are being addressed in upcoming releases.
Toml settings can be set as environment variables, in the config.toml file, in the tabs of the DAI EXPERT SETTINGS panel, or the TOML EDITOR on the EXPERT SETTINGS panel.
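As a minimal sketch of the environment-variable route, the helper below derives a variable name from a config.toml key. It assumes the convention of a DRIVERLESS_AI_ prefix followed by the upper-cased key, which is not stated in this document and should be checked against the configuration docs for your DAI version. The function name is hypothetical.

```python
# Assumption: environment variables use the DRIVERLESS_AI_ prefix plus the
# upper-cased config.toml key. Verify this against your DAI version's docs.
ENV_PREFIX = "DRIVERLESS_AI_"


def toml_key_to_env_var(key):
    """Map a config.toml key to the presumed environment variable name."""
    return ENV_PREFIX + key.upper()


print(toml_key_to_env_var("included_models"))
```

Server-level settings supplied this way are read at startup, so the restriction on changing them after the server has started still applies.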
Some of the toml settings are administrative and others are user based. Certain settings are applied when the server starts, including those related to authentication and data connectors. These cannot be changed after the server is started. For example, one cannot change any of the data upload configuration options after the server has started.
Here we are concerned with the EXPERIMENT settings. These can be accessed after choosing PREDICT on a dataset, at which point the TOML options for experiment control become available.
The EXPERIMENT panel options (like Accuracy, Time, and Interpretability) are not controlled by TOML settings. The toml expert settings along with these experiment settings are passed together to the experiment as shown in the PREVIEW.
If you launch the experiment, DAI executes the plan as described. If you choose CREATE LEADERBOARD, DAI runs a number of different experiments that are reasonable given the settings, so that you can review and compare the results. For example, one experiment may use just GLM and another just Random Forest, unless DAI was restricted from using RF.
In the DAI documentation under configuration/experiment configuration you can see that there are a very large number of options. In the DAI UI they are categorized as EXPERIMENT, MODEL, FEATURES, TIMESERIES, NLP, IMAGE, RECIPES, SYSTEM, and AUTODOC.
Types of EXPERIMENT TOML configs
- Those listed in the UI (must be exposed, categorized, and tagged to a panel). You can search for these in the DAI UI. You can also use the EDIT TOML feature to add and change these settings. Note that, in general, settings which appear only in the H2O Driverless AI documentation, and not in the UI, cannot be managed in EDIT TOML. This behavior is scheduled to be addressed in future DAI versions so that all the TOML settings will be treated the same. One caveat is that when editing the TOML, dictionaries must come last.
- Those not exposed to the UI. These must be added with EXPERIMENT expert setting “add to config.toml via toml string” that we call “any toml”, or added with TOML EDITOR using a special key
- Toml key is config_overrides, value is a key/value toml string
- Multiple settings can be entered separated by \n (i.e. new line), with no spaces
- These are parsed after the other TOML settings, in case the order matters to your experiment configuration
NOTE: If a setting isn’t exposed in the UI and you try to put it in EDIT TOML, you will find when you save and reopen that the erroneous entry has disappeared.
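The “any toml” value described above can be sketched as a small helper that joins several key/value lines with a newline. This is an illustration only: the helper name is hypothetical, the max_leaves grid is a placeholder rather than a recommendation, and it assumes that "no spaces" refers to the separator between settings. In the UI field the separator is typed as \n, while in a Python string it is a real newline character.

```python
def build_overrides(settings):
    """Join 'key = value' TOML lines with a newline and no padding around it.

    Values are passed as ready-made TOML literals (already quoted if needed).
    """
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


# Illustrative values only; the grid below is a placeholder.
xgb_grid = "\"{'max_leaves': [8, 16, 32, 64]}\""
overrides = build_overrides({
    "included_models": '["XGBoostGBM"]',
    "params_tune_xgboost": xgb_grid,
})
print(overrides)
```

Because these overrides are parsed after the other TOML settings, a key repeated here would be expected to take precedence over the same key set elsewhere.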
Searching in the docs we find the params_tune_xgboost setting. Note that cut-and-paste from the docs produces typographic (curly) quote characters that need to be changed to plain straight quotes. This will be changed and documented in version 1.10.3.
As an example, here we set up a custom hyper-parameter grid search for an XGBoostGBM model. To select only XGBoostGBM we could use the MODELS (AUTO/ON/OFF) expert settings under the MODELS tab.
It’s better to use the expert setting RECIPE>Include specific models, which overrides the settings from the MODEL tab.
Changes are now visible in the edit TOML panel. Conversely, if we add “GLM” to included_models = ["LightGBM", "GLM"] in edit TOML and save, we can see the changes in the RECIPE>Include specific models settings.
Now, from the H2O Driverless AI documentation, cut-and-paste the params_tune_xgboost setting into the “any toml” field and correct the single and double quotes. The entry now appears in the edit TOML with the double quotes escaped with the backslash “\”. This defines both the default hyperparameters and the mutations away from these during the evolution phase.
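The quote correction and the escaping seen in the edit TOML panel can be sketched as follows. The helper names are hypothetical, the max_leaves grid is a placeholder value, and the escaping shown is the standard TOML basic-string rule rather than anything confirmed about DAI internals.

```python
# Curly quotes that typically arrive with text pasted from rendered docs.
SMART_QUOTES = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
}


def straighten_quotes(text):
    """Replace curly quotes with plain ASCII quotes."""
    return text.translate(str.maketrans(SMART_QUOTES))


def escape_for_toml_string(text):
    """Escape backslashes and double quotes for a TOML basic string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


# Placeholder grid, pasted with curly quotes as it might come from the docs.
pasted = "params_tune_xgboost = \u201c{\u2018max_leaves\u2019: [8, 16, 32, 64]}\u201d"
fixed = straighten_quotes(pasted)
line = f'config_overrides = "{escape_for_toml_string(fixed)}"'
print(line)
```

The printed line has the same shape as the entry described above: the inner double quotes around the dictionary string are each preceded by a backslash.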
Note that TOML keys need to be unique and there is a preferred order of the settings:
- Lists, boolean, strings come first in the toml file
- Dictionaries must follow the above value types
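The ordering rule above can be sketched as a small reordering step applied before writing settings out. This is only an illustration: the function name and the example_flag and example_table keys are hypothetical, and a Python dict is used here because it also enforces the uniqueness of keys.

```python
def order_settings(settings):
    """Return settings with non-dictionary values first and dictionaries last.

    The relative order inside each group is preserved.
    """
    simple = {k: v for k, v in settings.items() if not isinstance(v, dict)}
    tables = {k: v for k, v in settings.items() if isinstance(v, dict)}
    return {**simple, **tables}


settings = {
    "example_table": {"some_key": 1},   # hypothetical dictionary-valued setting
    "included_models": ["XGBoostGBM"],
    "example_flag": True,               # hypothetical boolean setting
}
ordered = order_settings(settings)
print(list(ordered))
```

One plausible reason for the rule is that in TOML any key written after a table header belongs to that table, so simple values placed after a dictionary would no longer be read as top-level settings.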
Note: Settings internal to DAI, like the “last recipe” and “[recipe activation]”, should not be changed!