Commit

Merge branch 'main' into dr_clipping
FrancescMartiEscofetQC authored Jul 10, 2024
2 parents 9e84884 + f7a64e6 commit a980caf
Showing 1 changed file with 14 additions and 27 deletions.
41 changes: 14 additions & 27 deletions docs/examples/example_lime.ipynb
@@ -217,17 +217,21 @@
"source": [
"### Generating lime plots\n",
"\n",
-"``lime`` will expect a function which consumes an ``X`` and returns\n",
+"``lime`` will expect a function which consumes a ``np.ndarray`` ``X`` and returns\n",
"a one-dimensional vector of the same length as ``X``. We'll have to\n",
"adapt the {meth}`~metalearners.rlearner.RLearner.predict` method of\n",
-"our {class}`~metalearners.rlearner.RLearner` in two ways:\n",
+"our {class}`~metalearners.rlearner.RLearner` in three ways:\n",
"\n",
"* We need to pass a value for the necessary parameter ``is_oos`` to {meth}`~metalearners.rlearner.RLearner.predict`.\n",
"\n",
"* We need to reshape the output of\n",
"  {meth}`~metalearners.rlearner.RLearner.predict` to be one-dimensional. This\n",
"  we can easily achieve via {func}`metalearners.utils.simplify_output`.\n",
"\n",
+"* We need to convert the ``np.ndarray`` back to a ``pd.DataFrame`` to work with categoricals\n",
+"  and to restore the original categories so that the categorical codes (which LightGBM uses internally)\n",
+"  stay the same; see [this issue](https://github.com/microsoft/LightGBM/issues/5162) for more context.\n",
+"\n",
"This we can do as follows:"
]
},
@@ -244,7 +248,11 @@
"from metalearners.utils import simplify_output\n",
"\n",
"def predict(X):\n",
-"    return simplify_output(rlearner.predict(X, is_oos=True))"
+"    X_pd = pd.DataFrame(X, copy=True)\n",
+"    for c in X_pd.columns:\n",
+"        # This line sets the cat.categories correctly (even if not all are present in X)\n",
+"        X_pd[c] = X_pd[c].astype(df[feature_columns].iloc[:, c].dtype)\n",
+"    return simplify_output(rlearner.predict(X_pd, is_oos=True))"
]
},
{
@@ -254,26 +262,7 @@
"where we set ``is_oos=True`` since ``lime`` will call\n",
"{meth}`~metalearners.rlearner.RLearner.predict`\n",
"with various inputs which will not be able to be recognized as\n",
-"in-sample data.\n",
-"\n",
-"Since ``lime`` expects ``numpy`` datastructures, we'll have to\n",
-"manually encode the categorical features of our ``pandas`` data\n",
-"structure, see [this issue](https://github.com/microsoft/LightGBM/issues/5162) for more context."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {
-"vscode": {
-"languageId": "plaintext"
-}
-},
-"outputs": [],
-"source": [
-"X = df[feature_columns].copy()\n",
-"for categorical_feature_column in categorical_feature_columns:\n",
-"    X[categorical_feature_column] = X[categorical_feature_column].cat.codes"
+"in-sample data."
]
},
{
@@ -332,10 +321,8 @@
"from lime.lime_tabular import LimeTabularExplainer\n",
"from lime.submodular_pick import SubmodularPick\n",
"\n",
-"X = X.to_numpy()\n",
-"\n",
"explainer = LimeTabularExplainer(\n",
-"    X,\n",
+"    df[feature_columns].to_numpy(),\n",
"    feature_names=feature_columns,\n",
"    categorical_features=categorical_feature_indices,\n",
"    categorical_names=categorical_names,\n",
@@ -345,7 +332,7 @@
")\n",
"\n",
"sp = SubmodularPick(\n",
-"    data=X,\n",
+"    data=df[feature_columns].to_numpy(),\n",
"    explainer=explainer,\n",
"    predict_fn=predict,\n",
"    method=\"sample\",\n",
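
For context on the dtype handling in the new ``predict`` wrapper above: the sketch below is illustrative only and not part of the commit (the names ``full`` and ``subset`` are made up). It shows why the categorical dtype has to be taken from the full ``df[feature_columns]`` frame rather than inferred from ``X`` alone.

```python
import pandas as pd

# A subset that is missing some categories gets its own, smaller category set,
# so its integer codes no longer line up with what LightGBM saw during training.
full = pd.Series(["a", "b", "c"], dtype="category")
subset = pd.Series(["c"], dtype="category")

print(subset.cat.codes.tolist())                     # [0] -- codes relative to the subset's own categories
print(subset.astype(full.dtype).cat.codes.tolist())  # [2] -- codes relative to the full category set
```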
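The hunk above ends before the explanations are used. As a hedged follow-up sketch that is not shown in this diff, the instances picked by ``SubmodularPick`` could be rendered with lime's standard explanation API, assuming the explainer was created in regression mode since CATE estimates are continuous:

```python
import matplotlib.pyplot as plt

# sp_explanations holds the lime Explanation objects selected by the submodular pick;
# as_pyplot_figure draws one local-importance bar chart per picked instance.
for explanation in sp.sp_explanations:
    explanation.as_pyplot_figure()
plt.show()
```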
