Grid Search And Random Search Support in Shifu
If you set parameters in train#params to lists, training in Shifu will be treated as grid search: every combination of the listed values will be used as a set of model training parameters.
"params" : {
"NumHiddenLayers" : [1, 2],
"ActivationFunc" : [ ["tanh"], [ "Sigmoid", "Sigmoid" ] ],
"NumHiddenNodes" : [ [50], [45, 45 ] ],
"LearningRate" : [0.1, 0.2],
"Propagation" : "Q"
},
The hyper parameters (parameters that have multiple possible values) are sorted in alphabetical order, and then the index of each hyper parameter's value list is iterated in numeric order to produce the final training parameters. Taking the above params as an example, the four hyper parameters are sorted as below.
"ActivationFunc" : [ ["tanh"], [ "Sigmoid", "Sigmoid" ] ]
"LearningRate" : [0.1, 0.2]
"NumHiddenLayers" : [1, 2]
"NumHiddenNodes" : [ [50], [45, 45 ] ]
For the four hyper parameters, the configured parameter value list sizes are 2, 2, 2 and 2. Using list index to represent parameter value, the training parameter combinations will be 0000, 0001, 0010, 0011... The final effective training combinations will be as below.
{"ActivationFunc": ["tanh"], "LearningRate": 0.1, "NumHiddenLayers": 1, "NumHiddenNodes": [50], "Propagation" : "Q"}
{"ActivationFunc": ["tanh"], "LearningRate": 0.1, "NumHiddenLayers": 1, "NumHiddenNodes": [45, 45], "Propagation" : "Q"}
{"ActivationFunc": ["tanh"], "LearningRate": 0.1, "NumHiddenLayers": 2, "NumHiddenNodes": [50], "Propagation" : "Q"}
{"ActivationFunc": ["tanh"], "LearningRate": 0.1, "NumHiddenLayers": 2, "NumHiddenNodes": [45, 45], "Propagation" : "Q"}
{"ActivationFunc": ["tanh"], "LearningRate": 0.2, "NumHiddenLayers": 1, "NumHiddenNodes": [50], "Propagation" : "Q"}
{"ActivationFunc": ["tanh"], "LearningRate": 0.2, "NumHiddenLayers": 1, "NumHiddenNodes": [45, 45], "Propagation" : "Q"}
{"ActivationFunc": ["tanh"], "LearningRate": 0.2, "NumHiddenLayers": 2, "NumHiddenNodes": [50], "Propagation" : "Q"}
{"ActivationFunc": ["tanh"], "LearningRate": 0.2, "NumHiddenLayers": 2, "NumHiddenNodes": [45, 45], "Propagation" : "Q"}
{"ActivationFunc": ["Sigmoid", "Sigmoid"], "LearningRate": 0.1, "NumHiddenLayers": 1, "NumHiddenNodes": [50], "Propagation" : "Q"}
{"ActivationFunc": ["Sigmoid", "Sigmoid"], "LearningRate": 0.1, "NumHiddenLayers": 1, "NumHiddenNodes": [45, 45], "Propagation" : "Q"}
{"ActivationFunc": ["Sigmoid", "Sigmoid"], "LearningRate": 0.1, "NumHiddenLayers": 2, "NumHiddenNodes": [50], "Propagation" : "Q"}
{"ActivationFunc": ["Sigmoid", "Sigmoid"], "LearningRate": 0.1, "NumHiddenLayers": 2, "NumHiddenNodes": [45, 45], "Propagation" : "Q"}
{"ActivationFunc": ["Sigmoid", "Sigmoid"], "LearningRate": 0.2, "NumHiddenLayers": 1, "NumHiddenNodes": [50], "Propagation" : "Q"}
{"ActivationFunc": ["Sigmoid", "Sigmoid"], "LearningRate": 0.2, "NumHiddenLayers": 1, "NumHiddenNodes": [45, 45], "Propagation" : "Q"}
{"ActivationFunc": ["Sigmoid", "Sigmoid"], "LearningRate": 0.2, "NumHiddenLayers": 2, "NumHiddenNodes": [50], "Propagation" : "Q"}
{"ActivationFunc": ["Sigmoid", "Sigmoid"], "LearningRate": 0.2, "NumHiddenLayers": 2, "NumHiddenNodes": [45, 45], "Propagation" : "Q"}
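The expansion described above can be sketched in Python. This is a hypothetical illustration, not Shifu's actual Java implementation; the function name `grid_combinations` is made up, and treating every list-valued entry as a hyper parameter is a simplification:

```python
# Sketch of grid search expansion: sort hyper parameter names alphabetically,
# then take the Cartesian product of their value lists, with the last-sorted
# parameter's index varying fastest (the 0000, 0001, 0010, ... order above).
from itertools import product

def grid_combinations(params):
    """Expand params whose values are lists into all combinations."""
    # Simplification: any list value is treated as a list of candidates.
    hyper = {k: v for k, v in params.items() if isinstance(v, list)}
    fixed = {k: v for k, v in params.items() if not isinstance(v, list)}
    names = sorted(hyper)  # alphabetical order, as Shifu's docs describe
    combos = []
    for values in product(*(hyper[n] for n in names)):
        combo = dict(zip(names, values))
        combo.update(fixed)  # carry non-hyper params into every combination
        combos.append(combo)
    return combos

params = {
    "NumHiddenLayers": [1, 2],
    "ActivationFunc": [["tanh"], ["Sigmoid", "Sigmoid"]],
    "NumHiddenNodes": [[50], [45, 45]],
    "LearningRate": [0.1, 0.2],
    "Propagation": "Q",
}
print(len(grid_combinations(params)))  # 2 * 2 * 2 * 2 = 16
```

Running this on the example params yields the same 16 combinations listed above, in the same order.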
You can also use an external file to set grid search params. In this case, set train#gridConfigFile to a local file path, relative or absolute. Each line in the configuration file should be name:value pairs delimited by ';'. When this file is set, only the parameter combinations listed in it will be used as model training parameters.
FeatureSubsetStrategy:ONETHIRD;MaxDepth:7
FeatureSubsetStrategy:0.5;MaxDepth:6
FeatureSubsetStrategy:0.5;MaxDepth:6;LearningRate:0.04
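A line in this format could be parsed as sketched below. This is a hypothetical helper, not Shifu's parser; values are kept as strings here, while Shifu itself coerces them to the proper types:

```python
# Sketch: parse one gridConfigFile line of name:value pairs delimited by ';'.
def parse_grid_line(line):
    params = {}
    for pair in line.strip().split(";"):
        name, value = pair.split(":", 1)  # split on the first ':' only
        params[name] = value
    return params

print(parse_grid_line("FeatureSubsetStrategy:0.5;MaxDepth:6;LearningRate:0.04"))
# {'FeatureSubsetStrategy': '0.5', 'MaxDepth': '6', 'LearningRate': '0.04'}
```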
After each model is trained, its last validation error is used to select the best one. At the end, you can check your ModelConfig.json: it will be updated with the best parameter settings.
If there are too many combinations in Shifu grid search (more than 30), random search is enabled automatically: by default 30 combinations are sampled, models are trained on them, and the best parameters are selected by validation error.
Two parameters in shifuconfig can be tuned for grid search and random search.
## how many jobs can run in each round; if baggingNum is 10, there will be two rounds of bagging with 5 guagua jobs each
shifu.train.bagging.inparallel=5
## if the number of hyper parameter combinations is over this threshold, random search will be enabled
shifu.gridsearch.threshold=30
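The random search fallback can be sketched as below. This is a hypothetical illustration of the behavior the threshold controls, not Shifu's actual selection code:

```python
# Sketch: if the grid has more combinations than the threshold, sample
# `threshold` of them at random instead of training all of them.
import random

def select_combinations(all_combos, threshold=30, seed=None):
    if len(all_combos) <= threshold:
        return list(all_combos)  # small grid: plain grid search
    rng = random.Random(seed)
    return rng.sample(all_combos, threshold)  # random search

# 10 learning rates x 4 tree counts = 40 combinations, over the threshold.
combos = [{"LearningRate": 0.01 * i, "TreeNum": t}
          for i in range(1, 11) for t in (10, 50, 100, 200)]
print(len(select_combinations(combos)))  # 30
```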
The same applies to tree models as to NN: just set parameters in train#params to lists and training will be treated as grid search.
"params" : {
"TreeNum":[10, 50, 100],
"FeatureSubsetStrategy": ["ALL", "ONETHIRD", "HALF"],
"MaxDepth": 8,
"MaxStatsMemoryMB": 256,
"Impurity":"variance",
"LearningRate": [0.1, 0.5, 0.05],
"MinInstancesPerNode": 1,
"MinInfoGain": 0.0,
"Loss": "squared"
},
You can also put grid search param combinations in a file, e.g. grid.conf as below.
"train" : {
"baggingNum" : 1,
"baggingWithReplacement" : false,
"baggingSampleRate" : 1.0,
"validSetRate" : 0.1,
"numTrainEpochs" : 1500,
"isContinuous" : false,
"workerThreadCount" : 4,
"algorithm" : "GBT",
"gridConfigFile" : "grid.conf",
"params" : {
"MaxDepth": 8,
"MaxStatsMemoryMB": 256,
"Impurity":"variance",
"MinInstancesPerNode": 1,
"MinInfoGain": 0.0,
"Loss": "squared"
},
"customPaths" : {}
},
Contents in grid.conf file can be this format:
TreeNum:10;FeatureSubsetStrategy:ALL;LearningRate:0.1
TreeNum:10;FeatureSubsetStrategy:ALL;LearningRate:0.5
TreeNum:50;FeatureSubsetStrategy:ONETHIRD;LearningRate:0.5
TreeNum:100;FeatureSubsetStrategy:HALF;LearningRate:0.05
In this mode, train#params holds common/default param values. Even if you set grid-search-style lists in train#params as below, only the combinations in the file are used.
"params" : {
"MaxDepth": 8,
"MaxStatsMemoryMB": 256,
"Impurity":"variance",
"MinInstancesPerNode": 1,
"MinInfoGain": 0.0,
"Loss": "squared"
},
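The way each grid.conf line combines with the common defaults in train#params can be sketched as below. This is a hypothetical illustration of the merge, not Shifu's code:

```python
# Sketch: each grid.conf combination overrides/extends the common defaults
# from train#params to form the effective training parameters.
common = {
    "MaxDepth": 8, "MaxStatsMemoryMB": 256, "Impurity": "variance",
    "MinInstancesPerNode": 1, "MinInfoGain": 0.0, "Loss": "squared",
}
grid_lines = [  # parsed grid.conf lines
    {"TreeNum": 10, "FeatureSubsetStrategy": "ALL", "LearningRate": 0.1},
    {"TreeNum": 100, "FeatureSubsetStrategy": "HALF", "LearningRate": 0.05},
]
# File values win; common values fill in everything not listed in the file.
effective = [{**common, **line} for line in grid_lines]
print(effective[0]["TreeNum"], effective[0]["MaxDepth"])  # 10 8
```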