Generate parameter grid

Martin Müller edited this page Jun 19, 2020 · 2 revisions

You can use the python main.py generate_config command to generate a config (instead of writing one manually).

The command has the following arguments:

  --name NAME           Global name prefix and name of output file. (default: None)
  --train-data TRAIN_DATA
                        Train data path (default: None)
  --test-data TEST_DATA
                        Test data path (default: None)
  -m MODELS [MODELS ...], --models MODELS [MODELS ...]
                        List of models. Each model will be combined with each param pair. (default: None)
  -p [PARAMS [PARAMS ...]], --params [PARAMS [PARAMS ...]]
                        Arbitrary list of grid search params of the format `key:modifier:values`. Key=hyperparameter name, modifier=Can be either `val` (individual values), `lin` (linspace), or `log` (logspace), followed by the respective values or params for the lin/log space. Examples: num_epochs:val:2,3 or learning_rate:log:-6,-2,4 (default: [])
  -g [GLOBALS [GLOBALS ...]], --globals [GLOBALS [GLOBALS ...]]
                        List of global params which will be passed to all runs of the format `key:value` (default: [])

Common (global) parameters, which are passed unchanged to every run, are specified with the -g option; grid-searched parameters are specified with the -p option.
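
To make the `key:modifier:values` format more concrete, here is a minimal Python sketch of how such a spec could be expanded into a list of values. It is not the code used by main.py: the function name `parse_param` is hypothetical, and the sketch assumes that `lin` and `log` take `start,stop,num` in the style of NumPy's linspace/logspace (with base 10 for `log`), which the help text suggests but does not state.

```python
def _number(text):
    """Convert a string to int when possible, otherwise to float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def parse_param(spec):
    """Expand a 'key:modifier:values' spec into (key, list_of_values).

    Hypothetical helper for illustration only. Assumes 'lin' and 'log'
    take 'start,stop,num' (endpoints inclusive), and that 'log' treats
    start and stop as base-10 exponents.
    """
    key, modifier, raw = spec.split(":", 2)
    numbers = [_number(v) for v in raw.split(",")]
    if modifier == "val":
        # Individual values are used as given.
        return key, numbers
    if modifier in ("lin", "log"):
        start, stop, num = float(numbers[0]), float(numbers[1]), int(numbers[2])
        step = (stop - start) / (num - 1) if num > 1 else 0.0
        values = [start + i * step for i in range(num)]
        if modifier == "log":
            values = [10.0 ** v for v in values]
        return key, values
    raise ValueError(f"Unknown modifier: {modifier!r}")


print(parse_param("num_epochs:val:2,3"))  # ('num_epochs', [2, 3])
key, values = parse_param("learning_rate:log:-6,-2,4")
print(key, len(values))  # learning_rate 4
```

Under these assumptions, `learning_rate:log:-6,-2,4` yields four values spaced evenly on a log scale between roughly 1e-6 and 1e-2.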

Example

The following command generates a grid over four fasttext hyperparameters (dim, n_grams, num_epochs and learning_rate):

DATA_PATH=../data/annotation_data/train/category_merged
RUN_PREFIX=category_merged_grid_v1
python main.py generate_config \
  --name $RUN_PREFIX \
  --train-data train.csv \
  --test-data dev.csv \
  -m fasttext \
  -g write_test_output:true data_path:$DATA_PATH overwrite:true replace_user_with:user replace_url_with:url \
     save_model:false min_num_tokens:0 min_num_chars:0 standardize_punctuation:true remove_stop_words:false \
     asciify_emojis:false expand_contractions:true remove_emojis:false asciify:false \
  -p dim:val:10,20,30,50,100 n_grams:val:1,2,3 num_epochs:val:100,200,300 learning_rate:val:0.06,0.08,0.1,0.12

Running this command should generate a file named config.category_merged_grid_v1.json.
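
To get a feel for the size of the grid requested by the example command, the sketch below enumerates the combinations of the four `-p` parameters. It assumes the grid is a full Cartesian product of the listed values, which is the usual meaning of grid search but is not confirmed by the help text; the helper `expand_grid` is hypothetical and says nothing about the actual layout of the generated JSON file.

```python
from itertools import product

# Values taken from the -p arguments of the example command.
grid = {
    "dim": [10, 20, 30, 50, 100],
    "n_grams": [1, 2, 3],
    "num_epochs": [100, 200, 300],
    "learning_rate": [0.06, 0.08, 0.1, 0.12],
}


def expand_grid(grid):
    """Return one dict per combination (Cartesian product of all values)."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]


runs = expand_grid(grid)
print(len(runs))  # 180  (5 * 3 * 3 * 4)
print(runs[0])    # {'dim': 10, 'n_grams': 1, 'num_epochs': 100, 'learning_rate': 0.06}
```

If the assumption holds, the config describes 180 runs for the single fasttext model, so it is worth checking the grid size before adding more values.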

You can now train with this config. For FastText, the runs can be executed in parallel:

python main.py train -c config.category_merged_grid_v1.json --parallel