Shifu 0.2.5 Support Missing Value As a Bin
After running 'shifu stats', check ColumnConfig.json. You will find one extra bin at the end of ColumnStats:binCountPos and ColumnStats:binCountNeg. This is the missing-value bin.
Note that the size of binCountPos and binCountNeg is therefore the size of binBoundaries + 1.
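As a rough illustration of the layout described above, the following Python sketch builds count lists with one extra trailing slot for missing values. It is not Shifu's implementation. The function name `count_bins`, the use of `None` to mark a missing value, and the assumption that each boundary is the lower bound of its bin (with the first boundary being negative infinity) are all hypothetical choices made for this example.

```python
import bisect
import math


def count_bins(values, tags, boundaries):
    """Count positives and negatives per bin, with a trailing missing-value bin.

    values:     parsed floats, or None for a missing value (assumed convention)
    tags:       1 for positive, 0 for negative
    boundaries: sorted lower bounds of the bins (assumed to start at -inf)
    """
    size = len(boundaries) + 1  # one extra slot at the end for missing values
    bin_count_pos = [0] * size
    bin_count_neg = [0] * size
    for value, tag in zip(values, tags):
        if value is None:
            index = len(boundaries)  # the last bin holds missing values
        else:
            # Find the bin whose lower bound is the largest one <= value.
            index = max(bisect.bisect_right(boundaries, value) - 1, 0)
        if tag == 1:
            bin_count_pos[index] += 1
        else:
            bin_count_neg[index] += 1
    return bin_count_pos, bin_count_neg


boundaries = [-math.inf, 10.0, 20.0]
values = [5.0, 15.0, 25.0, None, None, 10.0]
tags = [1, 0, 1, 1, 0, 1]
pos_counts, neg_counts = count_bins(values, tags, boundaries)
```

With three boundaries, both returned lists have four entries, and the fourth entry counts only the records whose value was missing.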
The new KS and IV values are computed from the new binCountNeg and binCountPos, which include the missing-value counts.
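To show how the missing-value bin can feed into these statistics, here is a minimal Python sketch that uses the textbook definitions of KS (largest gap between the cumulative positive and negative distributions) and IV (sum of rate difference times WOE). Shifu's exact formulas, scaling, and smoothing may differ. The function name `ks_and_iv`, the `eps` smoothing term, and the sign convention of the logarithm are assumptions made for this example.

```python
import math


def ks_and_iv(bin_count_pos, bin_count_neg, eps=1e-10):
    """Return (KS, IV) from per-bin counts; the last bin is the missing-value bin."""
    total_pos = sum(bin_count_pos)
    total_neg = sum(bin_count_neg)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes need at least one record")

    cum_pos = cum_neg = 0.0
    ks = iv = 0.0
    for pos, neg in zip(bin_count_pos, bin_count_neg):
        pos_rate = pos / total_pos
        neg_rate = neg / total_neg
        cum_pos += pos_rate
        cum_neg += neg_rate
        # KS: largest absolute gap between the two cumulative distributions.
        ks = max(ks, abs(cum_pos - cum_neg))
        # eps avoids log(0) for empty bins (assumed smoothing, not Shifu's).
        woe = math.log((pos_rate + eps) / (neg_rate + eps))
        iv += (pos_rate - neg_rate) * woe
    return ks, iv


# Illustrative counts only: three regular bins plus the trailing missing-value bin.
ks, iv = ks_and_iv([10, 20, 30, 40], [40, 30, 20, 10])
```

Because the totals in the denominators include the missing-value bin, every bin's rate changes once missing values are counted, even for bins whose own counts are unchanged.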
In Shifu 0.2.5, the definition of a missing value is simple. For a numeric column, 'null', empty, and non-numeric values are treated as missing values. A future version may define missing values more precisely and may even let users specify the rules.
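The rule for numeric columns can be sketched in Python as follows. This is only an illustration of the rule as stated here, not Shifu's code. The function name `is_missing_numeric` is hypothetical, and two details are assumptions: the 'null' check is case-insensitive, and a parsed NaN is also treated as missing.

```python
import math


def is_missing_numeric(raw):
    """Return True when a raw numeric-column value should go to the missing-value bin."""
    if raw is None:
        return True
    text = str(raw).strip()
    # Empty strings and the literal 'null' count as missing.
    if text == "" or text.lower() == "null":
        return True
    try:
        value = float(text)
    except ValueError:
        return True  # not parseable as a number
    # Assumption: NaN parses as a float but carries no usable value.
    return math.isnan(value)
```

For example, `is_missing_numeric("abc")` and `is_missing_numeric("")` are both true, while `is_missing_numeric("3.14")` is false.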
Two lists are added to ColumnStats for weighted negative and positive values. As a result, each column has two WOE values: one based on counts and one based on weights.
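The difference between the two WOE values can be illustrated with a short Python sketch that applies the same per-bin WOE formula to count lists and to weighted lists. The standard definition ln(positive rate / negative rate) is used here; Shifu's sign convention and smoothing may differ. The names `woe_per_bin`, `weighted_pos`, and `weighted_neg` are hypothetical, and the numbers are made up for illustration.

```python
import math


def woe_per_bin(pos, neg, eps=1e-10):
    """Per-bin WOE as ln(pos_rate / neg_rate); eps avoids log(0) for empty bins."""
    total_pos = sum(pos)
    total_neg = sum(neg)
    return [
        math.log((p / total_pos + eps) / (n / total_neg + eps))
        for p, n in zip(pos, neg)
    ]


# Illustrative data: one regular bin plus the trailing missing-value bin.
bin_count_pos = [10, 30]
bin_count_neg = [30, 10]
# Hypothetical weighted sums of the same records (sum of record weights per bin).
weighted_pos = [5.0, 45.0]
weighted_neg = [30.0, 10.0]

count_woe = woe_per_bin(bin_count_pos, bin_count_neg)
weight_woe = woe_per_bin(weighted_pos, weighted_neg)
```

The two results differ whenever record weights are not uniform, which is why both values are kept per column.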