Variable Binning in Shifu

Zhang Pengshan (David) edited this page Dec 29, 2017 · 3 revisions

What is Variable Binning?

Binning

  • EqualInterval
  • EqualTotal
  • EqualPositive
  • EqualNegative
  • DynamicBinning
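
To make the difference between the first two methods concrete, here is a minimal sketch. It assumes EqualInterval means bins of equal width over the value range and EqualTotal means bins holding roughly the same number of records; the function names are hypothetical and this is not Shifu's actual implementation.

```python
from typing import List, Sequence


def equal_interval_boundaries(values: Sequence[float], num_bins: int) -> List[float]:
    """Lower boundaries of num_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    return [lo + i * width for i in range(num_bins)]


def equal_total_boundaries(values: Sequence[float], num_bins: int) -> List[float]:
    """Lower boundaries chosen so each bin holds roughly the same number of records."""
    ordered = sorted(values)  # the sort is the expensive step on large data
    n = len(ordered)
    boundaries: List[float] = []
    for i in range(num_bins):
        candidate = ordered[(i * n) // num_bins]
        # Skip repeated values so the boundaries stay strictly increasing.
        if not boundaries or candidate > boundaries[-1]:
            boundaries.append(candidate)
    return boundaries


values = [1, 2, 2, 3, 4, 5, 8, 13, 21, 100]
interval_bins = equal_interval_boundaries(values, 4)
total_bins = equal_total_boundaries(values, 4)
```

On skewed data like this sample, equal-width bins leave most records in the first bin, while equal-count bins spread them out. Under the same assumption, EqualPositive and EqualNegative would apply the equal-count idea to only the positive or only the negative records.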

Default Binning Algorithm

The default algorithm leverages the sort in MapReduce (MR), but it still has performance issues on big data.

Histogram Binning Algorithm

Dynamic Binning Algorithm

How Is Binning Used in Shifu?

Rebin Support in Shifu

If the binning produced in the stats step is not good, we can rebin to merge small bins together and improve the distribution:

shifu stats -rebin -n 30 -ivr 0.98 -bic 2000
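
The idea of merging small bins can be sketched as follows. This is an illustration only, not Shifu's rebin logic: it assumes a minimum per-bin record count (a plausible reading of `-bic 2000`, which this page does not define), and the function `merge_small_bins` and its sample data are hypothetical.

```python
from typing import List, Tuple


def merge_small_bins(
    boundaries: List[float], counts: List[int], min_count: int
) -> Tuple[List[float], List[int]]:
    """Merge under-populated bins into an adjacent bin.

    boundaries[i] is the lower boundary of bin i; counts[i] is its record count.
    """
    bounds = list(boundaries)
    cnts = list(counts)
    while len(cnts) > 1:
        smallest = min(range(len(cnts)), key=cnts.__getitem__)
        if cnts[smallest] >= min_count:
            break  # every bin is large enough
        # Pick the adjacent bin with fewer records as the merge partner.
        if smallest == 0:
            neighbor = 1
        elif smallest == len(cnts) - 1:
            neighbor = smallest - 1
        elif cnts[smallest - 1] <= cnts[smallest + 1]:
            neighbor = smallest - 1
        else:
            neighbor = smallest + 1
        left, right = sorted((smallest, neighbor))
        cnts[left] += cnts[right]
        del cnts[right]
        del bounds[right]  # the merged bin keeps the left bin's lower boundary
    return bounds, cnts


new_bounds, new_counts = merge_small_bins(
    [0, 10, 20, 30, 40], [5000, 300, 2500, 1200, 4000], min_count=2000
)
```

Here the bins with 300 and 1200 records are absorbed by their neighbors, leaving three bins that each meet the threshold while the total record count is unchanged.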