Xgbfi

XGBoost Feature Interactions & Importance

What is Xgbfi?

Xgbfi is an XGBoost model dump parser that ranks features and feature interactions by several metrics.

Siblings

Xgbfir - Python port

The Metrics

  • Gain: Total gain of each feature or feature interaction
  • FScore: Number of possible splits taken on a feature or feature interaction
  • wFScore: Number of possible splits taken on a feature or feature interaction, weighted by the probability that the splits take place
  • Average wFScore: wFScore divided by FScore
  • Average Gain: Gain divided by FScore
  • Expected Gain: Total gain of each feature or feature interaction, weighted by the probability of gathering the gain
  • Average Tree Index
  • Average Tree Depth
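
The two averaged metrics above are plain ratios of the accumulated totals. A minimal sketch of that arithmetic follows; the `InteractionStats` class and the sample numbers are hypothetical and are not taken from Xgbfi's source.

```python
from dataclasses import dataclass


@dataclass
class InteractionStats:
    """Hypothetical container for the totals of one feature or interaction."""
    gain: float     # Gain: total gain
    fscore: int     # FScore: number of possible splits
    wfscore: float  # wFScore: splits weighted by their probability

    @property
    def average_gain(self) -> float:
        # Average Gain = Gain / FScore
        return self.gain / self.fscore

    @property
    def average_wfscore(self) -> float:
        # Average wFScore = wFScore / FScore
        return self.wfscore / self.fscore


# Made-up totals, for illustration only
stats = InteractionStats(gain=120.0, fscore=4, wfscore=2.5)
print(stats.average_gain)     # 30.0
print(stats.average_wfscore)  # 0.625
```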

Additional Features

  • Leaf Statistics
  • Split Value Histograms

Usage

[mono] XgbFeatureInteractions.exe [-help|options]

Quick Guide

a) Creating a feature map (fmap)

def create_feature_map(fmap_filename, features):
    """
    features: enumerable of feature names
    """
    # One tab-separated line per feature: index, name, type ('q' = quantitative)
    with open(fmap_filename, 'w') as outfile:
        for i, feat in enumerate(features):
            outfile.write('{0}\t{1}\tq\n'.format(i, feat))

create_feature_map('xgb.fmap', features)
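
To make the resulting file format concrete, here is a small self-contained run of the same function. The feature names are hypothetical and the file is written to a temporary directory. Each line holds the feature index, the feature name, and the type `q`, which XGBoost's feature map format uses for quantitative features.

```python
import os
import tempfile


def create_feature_map(fmap_filename, features):
    """Write one tab-separated line per feature: index, name, type 'q'."""
    with open(fmap_filename, 'w') as outfile:
        for i, feat in enumerate(features):
            outfile.write('{0}\t{1}\tq\n'.format(i, feat))


# Hypothetical feature names, used only for illustration
features = ['age', 'income', 'tenure']

fmap_path = os.path.join(tempfile.mkdtemp(), 'xgb.fmap')
create_feature_map(fmap_path, features)

# Read the file back to inspect what was written
with open(fmap_path) as f:
    fmap_lines = f.read().splitlines()

print(fmap_lines)  # ['0\tage\tq', '1\tincome\tq', '2\ttenure\tq']
```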

b) Dumping an XGBoost model

gbdt.dump_model('xgb.dump', fmap='xgb.fmap', with_stats=True)
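
With `with_stats=True`, each split line in the text dump carries `gain` and `cover` values, which is what a dump parser needs. The sketch below is not Xgbfi's implementation: it handles single features only (no interactions), and `SAMPLE_DUMP`, `SPLIT_RE` and `summarize` are hypothetical names. It shows how total gain and a simple split count per feature could be read from such a dump.

```python
import re
from collections import defaultdict

# Hypothetical excerpt in the shape of an XGBoost text dump with stats;
# the feature names and values are made up.
SAMPLE_DUMP = (
    "booster[0]:\n"
    "0:[age<30.5] yes=1,no=2,missing=1,gain=100.0,cover=50\n"
    "\t1:[income<4000] yes=3,no=4,missing=3,gain=20.5,cover=30\n"
    "\t\t3:leaf=0.1,cover=10\n"
    "\t\t4:leaf=-0.2,cover=20\n"
    "\t2:leaf=0.3,cover=20\n"
    "booster[1]:\n"
    "0:[age<41.5] yes=1,no=2,missing=1,gain=60.0,cover=50\n"
    "\t1:leaf=0.05,cover=25\n"
    "\t2:leaf=-0.05,cover=25\n"
)

# Matches split nodes such as "0:[age<30.5] ... gain=100.0"; leaf lines do not match
SPLIT_RE = re.compile(
    r"\d+:\[(?P<feature>[^<\]]+)<[^\]]+\].*?gain=(?P<gain>[-+0-9.eE]+)"
)


def summarize(dump_text):
    """Return (total gain per feature, number of splits per feature)."""
    gain = defaultdict(float)
    count = defaultdict(int)
    for line in dump_text.splitlines():
        m = SPLIT_RE.search(line)
        if m:
            gain[m['feature']] += float(m['gain'])
            count[m['feature']] += 1
    return dict(gain), dict(count)


total_gain, split_count = summarize(SAMPLE_DUMP)
print(total_gain)   # {'age': 160.0, 'income': 20.5}
print(split_count)  # {'age': 2, 'income': 1}
```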

c) Editing parameters in XgbFeatureInteractions.exe.config

<setting name="XgbModelFile" serializeAs="String">
    <value>xgb.dump</value>
</setting>
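
If you prefer to change the setting from a script rather than by hand, something along these lines may work. It operates only on the fragment shown above, because the rest of the config file's structure is not covered here. The `set_setting_value` helper and the new path are hypothetical.

```python
import xml.etree.ElementTree as ET

# The fragment from the step above; the surrounding config file is not reproduced
fragment = """<setting name="XgbModelFile" serializeAs="String">
    <value>xgb.dump</value>
</setting>"""


def set_setting_value(setting, new_value):
    """Replace the text of the <value> child of a <setting> element."""
    setting.find('value').text = new_value
    return setting


setting = ET.fromstring(fragment)
set_setting_value(setting, 'models/my_model.dump')  # hypothetical path

updated = ET.tostring(setting, encoding='unicode')
print(updated)
```

For a complete file, the same helper could be applied to the elements returned by `root.iter('setting')` after `ET.parse(...)`, filtered by their `name` attribute.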

d) Running [mono] XgbFeatureInteractions.exe without command-line parameters
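
The `[mono]` prefix is needed only where the executable cannot be run directly. A minimal sketch of choosing the command from a script, assuming `mono` is on the `PATH` on non-Windows systems; the `build_command` helper is hypothetical.

```python
import sys


def build_command(exe_path='XgbFeatureInteractions.exe', platform=sys.platform):
    """Return the argv list: the executable alone on Windows, via mono elsewhere."""
    if platform.startswith('win'):
        return [exe_path]
    return ['mono', exe_path]


print(build_command(platform='win32'))  # ['XgbFeatureInteractions.exe']
print(build_command(platform='linux'))  # ['mono', 'XgbFeatureInteractions.exe']

# To actually launch the tool with the settings from the config file:
# import subprocess; subprocess.run(build_command(), check=True)
```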