Generate parameter grid
Martin Müller edited this page Jun 19, 2020 · 2 revisions
You can use the python main.py generate_config command to generate a config (instead of writing one manually).
The command has the following arguments:
--name NAME Global name prefix and name of output file. (default: None)
--train-data TRAIN_DATA
Train data path (default: None)
--test-data TEST_DATA
Test data path (default: None)
-m MODELS [MODELS ...], --models MODELS [MODELS ...]
List of models. Each model will be combined with each param pair. (default: None)
-p [PARAMS [PARAMS ...]], --params [PARAMS [PARAMS ...]]
Arbitrary list of grid search params of the format `key:modifier:values`. Key=hyperparameter name, modifier=Can be either `val` (individual values), `lin` (linspace), or `log` (logspace), followed by the respective values or params for the lin/log space. Examples: num_epochs:val:2,3 or learning_rate:log:-6,-2,4 (default: [])
-g [GLOBALS [GLOBALS ...]], --globals [GLOBALS [GLOBALS ...]]
List of global params which will be passed to all runs of the format `key:value` (default: [])
Common/global parameters are specified with the -g option, and the grid-searched parameters with the -p argument.
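As a rough illustration of the `key:modifier:values` format described in the help text, the sketch below expands a single -p entry into a list of values. It is not the code used by main.py. The function names (expand_param, parse_value, linspace) are hypothetical, and it assumes that `lin` and `log` take `start,stop,num`, with `log` treating start and stop as base-10 exponents (as NumPy's linspace and logspace do).

```python
def parse_value(text):
    """Parse an int where possible, then a float, otherwise keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop (inclusive)."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def expand_param(spec):
    """Expand one 'key:modifier:values' string into (key, list_of_values).

    Assumption: 'lin' and 'log' take 'start,stop,num'; for 'log' the
    start and stop are interpreted as base-10 exponents.
    """
    key, modifier, raw = spec.split(":", 2)
    parts = raw.split(",")
    if modifier == "val":
        return key, [parse_value(p) for p in parts]
    if modifier in ("lin", "log"):
        start, stop, num = float(parts[0]), float(parts[1]), int(parts[2])
        values = linspace(start, stop, num)
        if modifier == "log":
            values = [10.0 ** v for v in values]
        return key, values
    raise ValueError(f"unknown modifier: {modifier!r}")


print(expand_param("num_epochs:val:2,3"))
print(expand_param("learning_rate:log:-6,-2,4"))
```

Under these assumptions, `num_epochs:val:2,3` yields the two values 2 and 3, and `learning_rate:log:-6,-2,4` yields four learning rates spaced logarithmically between 1e-6 and 1e-2.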
For example, the following command generates a grid for FastText over four hyperparameters (dim, n_grams, num_epochs and learning_rate):
DATA_PATH=../data/annotation_data/train/category_merged
RUN_PREFIX=category_merged_grid_v1
python main.py generate_config \
--name $RUN_PREFIX \
--train-data train.csv \
--test-data dev.csv \
-m fasttext \
-g write_test_output:true data_path:$DATA_PATH overwrite:true replace_user_with:user replace_url_with:url \
save_model:false min_num_tokens:0 min_num_chars:0 standardize_punctuation:true remove_stop_words:false \
asciify_emojis:false expand_contractions:true remove_emojis:false asciify:false \
-p dim:val:10,20,30,50,100 n_grams:val:1,2,3 num_epochs:val:100,200,300 learning_rate:val:0.06,0.08,0.1,0.12
Running this command should generate a file named config.category_merged_grid_v1.json.
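To get a feel for the size of the grid in the command above, the following sketch enumerates the combinations with itertools.product. It assumes that one run is generated per model and per combination of the -p values. The dictionary layout and the expand_grid helper are hypothetical and do not reflect the actual structure of the generated JSON file.

```python
from itertools import product

# Values taken from the -p arguments of the example command.
grid = {
    "dim": [10, 20, 30, 50, 100],
    "n_grams": [1, 2, 3],
    "num_epochs": [100, 200, 300],
    "learning_rate": [0.06, 0.08, 0.1, 0.12],
}
models = ["fasttext"]


def expand_grid(models, grid):
    """Return one dict per (model, parameter combination).

    Hypothetical helper: the real config layout may differ.
    """
    keys = list(grid)
    runs = []
    for model in models:
        for combo in product(*(grid[k] for k in keys)):
            runs.append({"model": model, **dict(zip(keys, combo))})
    return runs


runs = expand_grid(models, grid)
print(len(runs))  # 5 * 3 * 3 * 4 = 180
print(runs[0])
```

With 180 runs for a single model, the grid grows quickly as values are added, which is worth keeping in mind before adding another hyperparameter to -p.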
You can now run this config. For FastText we can run this config in parallel:
python main.py train -c config.category_merged_grid_v1.json --parallel