Train Regression Model in Shifu
wu haifeng edited this page Mar 1, 2021 · 4 revisions
Shifu is designed mainly for 0-1 regression, and that includes its data binning, data normalization, and variable selection steps. However, Shifu can also be used for linear regression.
There are two ways to train a regression model in Shifu.
The first way uses a temporary 0-1 target:
- Create a temporary 0-1 target column from the original target (how you derive it is up to you).
- Run `shifu stats`, `shifu norm`, and `shifu varsel` as normal.
- After ColumnConfig.json is generated and the final variables are selected, change the target from the temporary column back to the original target column, and remove the tags in `posTags` and `negTags`.
- Add `OutputActivationFunc` to ModelConfig.json -> train -> params. Its value can be `Linear`, `ReLU`, `LeakyReLU`, or `Swish`, depending on what you need.
- Rerun the `shifu norm` and `shifu train` steps to build the model.
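How the temporary 0-1 target is derived is left open above. As one possible illustration, the sketch below labels each row 1 when its numeric target is above the median and 0 otherwise. The column names `target` and `tmp_target` and the median cutoff are assumptions made for this example, not something Shifu prescribes.

```python
from statistics import median


def add_temporary_target(rows, target="target", temp_target="tmp_target"):
    """Return copies of the rows with an extra 0-1 column derived from the numeric target.

    The column names and the median cutoff are hypothetical choices for this sketch.
    """
    cutoff = median(row[target] for row in rows)
    # Rows strictly above the median are labeled 1, all others 0.
    return [{**row, temp_target: 1 if row[target] > cutoff else 0} for row in rows]


rows = [{"target": 1.5}, {"target": 7.0}, {"target": 3.2}, {"target": 9.9}]
labeled = add_temporary_target(rows)
for row in labeled:
    print(row)
```

Any other rule that yields a 0-1 column would serve the same purpose, since the temporary target is only used to drive the stats, norm, and variable selection steps before the original target is restored.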
The second way trains directly on the original target:
- Keep `posTags` and `negTags` empty in ModelConfig.json. (Attention: "" is not empty; [] is empty.)
- Use `EqualTotal` for binning when running `shifu stats`.
- Use `ONEHOT` or `ZSCALE_ONEHOT` for data normalization.
- Since IV and KS are all zeros, you can use `SE` for variable selection, or run `shifu varsel -f <variables.names.file>` to select variables manually.
- Add `OutputActivationFunc` to ModelConfig.json -> train -> params. Its value can be `Linear`, `ReLU`, `LeakyReLU`, or `Swish`, depending on what you need.
- Rerun the `shifu norm` and `shifu train` steps to build the model.
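The configuration changes in the steps above can be sketched as a small helper that works on an in-memory copy of the configuration. This is only an illustration: the helper name `prepare_regression_config` is hypothetical, and placing `posTags` and `negTags` under a `dataSet` key is an assumption about the ModelConfig.json layout. Only the train -> params location of `OutputActivationFunc` and its four allowed values come from the text.

```python
import json

ALLOWED_ACTIVATIONS = {"Linear", "ReLU", "LeakyReLU", "Swish"}


def prepare_regression_config(model_config, activation="Linear"):
    """Return a copy of the config with empty tag lists and an output activation set."""
    if activation not in ALLOWED_ACTIVATIONS:
        raise ValueError(f"unsupported OutputActivationFunc: {activation}")
    # Round-trip through JSON to get a deep copy and leave the input untouched.
    config = json.loads(json.dumps(model_config))
    data_set = config.setdefault("dataSet", {})  # assumed location of the tags
    data_set["posTags"] = []  # an empty list, not "" or [""]
    data_set["negTags"] = []
    params = config.setdefault("train", {}).setdefault("params", {})
    params["OutputActivationFunc"] = activation
    return config


example = {
    "dataSet": {"posTags": ["1"], "negTags": ["0"]},
    "train": {"params": {}},
}
updated = prepare_regression_config(example, "ReLU")
print(json.dumps(updated, indent=2))
```

Editing ModelConfig.json by hand achieves the same result; the point of the sketch is that the tags must end up as `[]` and the activation value must be one of the four listed names.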
GBDT natively supports regression when the impurity is set to variance. Follow the steps above to prepare the data, then run `shifu train` with GBDT to train a regression model. In the eval step, one parameter needs to be set to avoid applying a sigmoid to the final output:
"evals" : [ {
  "name" : "Eval1",
  "dataSet" : {
    ...
  },
  "gbtScoreConvertStrategy" : "RAW",
  ...
} ]
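To see why a sigmoid on the final output is a problem for regression, the sketch below applies a plain logistic sigmoid to a few made-up raw scores. It is an assumption that the non-RAW conversion behaves like this textbook function; the exact formula Shifu uses is not shown here.

```python
import math


def sigmoid(x):
    """Plain logistic function, used here only to illustrate the squashing effect."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical raw regression outputs on very different scales.
raw_scores = [-3.0, 0.0, 12.5, 250.0]
squashed = [sigmoid(score) for score in raw_scores]

for raw, value in zip(raw_scores, squashed):
    print(f"raw={raw:>7.1f}  after sigmoid={value:.6f}")
```

Every value is forced into the range 0 to 1, so the original scale of the regression target is lost. Keeping the raw score preserves that scale.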