Variable Binning in Shifu
Zhang Pengshan (David) edited this page Dec 29, 2017 · 3 revisions
- EqualInterval
- EqualTotal
- EqualPositive
- EqualNegative
- DynamicBinning
Sorting in MapReduce (MR) is leveraged, but performance is still an issue on big data.
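To make the difference between the binning methods listed above concrete, here is a minimal Python sketch. It is not Shifu's implementation. It assumes that EqualInterval means bins of equal width over the value range and that EqualTotal means bins holding roughly the same number of records; the function names are hypothetical. The second function sorts the data, which is the step that becomes expensive on large inputs.

```python
def equal_interval_bounds(values, n_bins):
    """Inner cut points that split [min, max] into n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def equal_count_bounds(values, n_bins):
    """Inner cut points chosen so each bin holds about the same number of records.

    This needs a full sort of the values, which is the costly part on big data.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
```

Under the same assumptions, an EqualPositive or EqualNegative style split could be approximated by passing only the values of positive (or negative) records to `equal_count_bounds`.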
- KS Value
- Information Value (IV)
- Woe Transform
- Tree Model Training
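KS, IV and WoE in the list above are all derived from per-bin counts of positive and negative records. The following is a sketch using the common textbook definitions, not code taken from Shifu. The function name `bin_stats`, the smoothing constant `eps`, and the sign convention (log of positive rate over negative rate) are assumptions; some tools use the opposite sign for WoE or scale KS by 100.

```python
import math


def bin_stats(pos_counts, neg_counts, eps=1e-10):
    """Return (woe_per_bin, iv, ks) from per-bin positive/negative counts."""
    pos_total = sum(pos_counts)
    neg_total = sum(neg_counts)
    woe = []
    iv = 0.0
    ks = 0.0
    cum_pos = 0.0
    cum_neg = 0.0
    for p, n in zip(pos_counts, neg_counts):
        pos_rate = p / pos_total
        neg_rate = n / neg_total
        # eps avoids log(0) when a bin has no positives or no negatives
        w = math.log((pos_rate + eps) / (neg_rate + eps))
        woe.append(w)
        iv += (pos_rate - neg_rate) * w
        # KS is the largest gap between the two cumulative distributions
        cum_pos += pos_rate
        cum_neg += neg_rate
        ks = max(ks, abs(cum_pos - cum_neg))
    return woe, iv, ks
```

For example, `bin_stats([10, 30], [30, 10])` describes two bins in which positives concentrate in the second bin; the returned WoE values have opposite signs for the two bins.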
If the binning produced in the stats step is not good, we can also re-bin to merge small bins together and improve the distribution:
shifu stats -rebin -n 30 -ivr 0.98 -bic 2000
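The idea of merging small bins can be illustrated with a short Python sketch. This is only an assumption about the general technique, not the algorithm behind `shifu stats -rebin`; the function name and the `min_count` threshold are hypothetical and are not tied to any of the command's options.

```python
def merge_small_bins(counts, min_count):
    """Merge adjacent bins until every bin holds at least min_count records.

    counts is the list of record counts per bin, in bin order.
    """
    merged = []
    for c in counts:
        if merged and merged[-1] < min_count:
            # previous bin is still too small: absorb this bin into it
            merged[-1] += c
        else:
            merged.append(c)
    # a too-small trailing bin is folded back into its left neighbour
    if len(merged) > 1 and merged[-1] < min_count:
        last = merged.pop()
        merged[-1] += last
    return merged
```

With `merge_small_bins([5, 100, 3, 2, 200, 1], 50)` the six original bins collapse into two, and the total record count is unchanged.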