From fb2452abca9ae21dfc969ba84b06c46008653f23 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?=
Date: Mon, 24 Jun 2024 10:55:57 +0200
Subject: [PATCH 01/12] Improve LIME

---
 docs/examples/example_lime.ipynb | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/docs/examples/example_lime.ipynb b/docs/examples/example_lime.ipynb
index cf22a6f..920f06f 100644
--- a/docs/examples/example_lime.ipynb
+++ b/docs/examples/example_lime.ipynb
@@ -54,7 +54,7 @@
     "* {math}`f`, the original model -- in our case the MetaLearner\n",
     "* {math}`G`, the class of possible, interpretable surrogate models\n",
     "* {math}`\Omega(g)`, a measure of complexity for {math}`g \in G`\n",
-    "* {math}`\pi_x(z)` a proximity measure of {math}`z` with respect to data point {math}`x`\n",
+    "* {math}`\pi_x(z)` a proximity measure of an instance {math}`z` with respect to data point {math}`x`\n",
     "* {math}`\mathcal{L}(f, g, \pi_x)` a measure of how unfaithful a {math}`g \in G` is to {math}`f` in the locality defined by {math}`\pi_x`\n",
     "\n",
     "Given all of these objects as well as a data point {math}`x` that is to be explained, the authors suggest that the most appropriate surrogate {math}`g`, also referred to as the explanation for {math}`x`, {math}`\xi(x)`, can be expressed as follows:\n",
@@ -74,10 +74,10 @@
     "* showcase the features with highest global importance\n",
     "\n",
     "In line with this ambition, they define a notion of 'coverage' -- to\n",
-    "be maximized --as follows:\n",
+    "be maximized -- with respect to a set of explanations {math}`V`, as follows:\n",
     "\n",
     "```{math}\n",
-    "    c(V, W, \mathcal{I}) = \sum_{j=1}^{d} I[\exists i \in V: W_{i,j} > 0] \mathcal{I}_j\n",
+    "    c(V, W, \mathcal{I}) = \sum_{j=1}^{d} \mathbb{I}\{\exists i \in V: W_{i,j} > 0\} \mathcal{I}_j\n",
     "```\n",
     "\n",
     "where\n",
@@ -85,7 +85,8 @@
     "* {math}`d` is the number of features\n",
     "* {math}`V` is the candidate set of explanations to be shown to\n",
     "  humans, within a fixed budget -- this is the variable to be optimized\n",
-    "* {math}`W` is a {math}`n \times d` local feature importance matrix and\n",
+    "* {math}`W` is an {math}`n \times d` local feature importance matrix that represents\n",
+    "  the local importance of each feature for each instance, and\n",
     "* {math}`\mathcal{I}` is a {math}`d`-dimensional vector of global\n",
     "  feature importances\n",
     "\n",
@@ -359,7 +360,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "For guidelines on how to interpret such lime plots please see the [lime documentation](https://github.com/marcotcr/lime)."
+    "In these plots, the green bars signify that the corresponding feature referenced on the\n",
+    "y-axis positively impacts CATE estimation. On the other hand, the red bars represent\n",
+    "features that negatively impact the CATE estimation.\n",
+    "Furthermore, the length of these coloured bars corresponds to the magnitude of each feature's\n",
+    "contribution towards the model prediction. Therefore, the longer the bar, the more\n",
+    "significant the impact of that feature on the overall model prediction.\n",
+    "\n",
+    "For more guidelines on how to interpret such lime plots please see the [lime documentation](https://github.com/marcotcr/lime)."
    ]
   }
 ],
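To make the 'coverage' objective defined in the patch above concrete, here is a minimal,
self-contained Python sketch of c(V, W, I); the values of `W` and `I` below are made up
for illustration and are not the output of any fitted MetaLearner.

```python
import numpy as np

def coverage(V, W, I):
    # c(V, W, I): a feature j counts as covered if at least one chosen
    # explanation i in V assigns it positive local importance W[i, j];
    # every covered feature contributes its global importance I[j].
    covered = (W[list(V)] > 0).any(axis=0)
    return float(I[covered].sum())

W = np.array([[0.5, 0.0, 0.2],
              [0.0, 0.3, 0.0]])  # n = 2 explanations, d = 3 features
I = np.array([1.0, 2.0, 0.5])    # global feature importances

print(coverage({0}, W, I))     # features 0 and 2 covered -> 1.5
print(coverage({0, 1}, W, I))  # all features covered -> 3.5
```

Maximizing this coverage over candidate sets `V` of a fixed size corresponds to the
greedy, submodular pick the LIME authors propose for choosing which explanations to
show within the budget.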
From 47526c1f172eeb3390b801058973f0ad342f725b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?=
Date: Mon, 24 Jun 2024 11:05:44 +0200
Subject: [PATCH 02/12] Improve SHAP

---
 .../example_feature_importance_shap.ipynb | 21 +++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/docs/examples/example_feature_importance_shap.ipynb b/docs/examples/example_feature_importance_shap.ipynb
index 6e2a662..5d710d5 100644
--- a/docs/examples/example_feature_importance_shap.ipynb
+++ b/docs/examples/example_feature_importance_shap.ipynb
@@ -326,7 +326,8 @@
    "source": [
     "Note that the method {meth}`~metalearners.explainer.Explainer.feature_importances`\n",
     "returns a list of length {math}`n_{variants} - 1` that indicates the feature importance for\n",
-    "each variant against control.\n",
+    "each variant against control. Remember that a higher value means that the corresponding\n",
+    "feature is more important for the CATE prediction.\n",
     "\n",
     "### Computing and plotting the SHAP values\n",
     "\n",
@@ -367,7 +368,23 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "For guidelines on how to interpret such SHAP plots please see the [SHAP documentation](https://github.com/shap/shap).\n",
+    "In these SHAP summary plots, the color and orientation of the plotted values help understand\n",
+    "their impact on model predictions.\n",
+    "\n",
+    "Each dot in the plot represents a single instance of the given feature present in the data set.\n",
+    "The x-axis conveys the Shapley value, signifying the strength and directionality of the\n",
+    "feature's impact. The y-axis displays each feature in the model.\n",
+    "\n",
+    "The color coding implemented in these plots is straightforward: red implies a high feature value,\n",
+    "while blue denotes a low feature value. This color scheme assists in identifying whether\n",
+    "high or low values of a certain feature influence the model's output positively or negatively.\n",
+    "The categorical variables are grey colored.\n",
+    "\n",
+    "The Shapley value, exhibited on the horizontal axis, is oriented such that values on the\n",
+    "right of the center line (0 mark) contribute to a positive shift in the predicted outcome,\n",
+    "while those on the left indicate a negative impact.\n",
+    "\n",
+    "For more guidelines on how to interpret such SHAP plots please see the [SHAP documentation](https://github.com/shap/shap).\n",
     "\n",
     "Note that the method {meth}`~metalearners.explainer.Explainer.shap_values`\n",
     "returns a list of length {math}`n_{variants} - 1` that indicates the SHAP values for\n",
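A summary plot of the kind described in the patch above can be produced with the shap
library directly; the sketch below trains a standalone `LGBMRegressor` on synthetic data
as a stand-in for one of the MetaLearner's fitted models, so the model, the data and the
feature names are illustrative assumptions rather than the notebook's actual objects.

```python
import numpy as np
import shap
from lightgbm import LGBMRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
# The outcome depends positively on x0 and negatively on x1; x2 and x3 are noise.
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = LGBMRegressor().fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)

# Each dot is one instance: its x-position is the Shapley value, i.e. the signed
# contribution to the prediction, and its color encodes the feature value
# (red for high, blue for low).
shap.summary_plot(shap_values, X, feature_names=["x0", "x1", "x2", "x3"])
```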
From 0b4166b6e46e3a6be62b6cafe33397a6ebabce25 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?=
Date: Mon, 24 Jun 2024 11:11:42 +0200
Subject: [PATCH 03/12] Improve FAQ and glossary

---
 docs/faq.rst      | 3 ++-
 docs/glossary.rst | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/faq.rst b/docs/faq.rst
index 13d1de2..b149eab 100644
--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -29,7 +29,8 @@ FAQ
     Double machine learning is an ATE estimation technique, pioneered by
     `Chernozhukov et al. (2016) `_. It is 'double' in the sense that it
     relies on two preliminary models: one for the probability of
-    receiving treatment given covariates (the propensity score), and one for the outcome given treatment and covariates.
+    receiving treatment given covariates (the propensity score), and one for the outcome given covariates and
+    optionally the treatment.
     Double ML is also referred to as 'debiased' ML, since the propensity score
     model is used to 'debias' a naive estimator that uses the outcome model to
     predict the expected outcome under treatment, and under no treatment,

diff --git a/docs/glossary.rst b/docs/glossary.rst
index d9232cd..39c2c13 100644
--- a/docs/glossary.rst
+++ b/docs/glossary.rst
@@ -24,8 +24,8 @@ Glossary
         Similar to the R-Learner, the Double Machine Learning blueprint
         relies on estimating two nuisance models in its first stage: a
         propensity model as well as an outcome model. Unlike the
-        R-Learner, the last-stage or treatment effect model might not
-        be any estimator.
+        R-Learner, the last-stage or treatment effect model might need a
+        specific type of estimator.
         See `Chernozhukov et al. (2016) `_.

     Heterogeneous Treatment Effect (HTE)

From 209ea06a87c2ccb91b88f4fb40beac422a2ef5fe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?=
Date: Mon, 24 Jun 2024 11:14:04 +0200
Subject: [PATCH 04/12] Improve glossary

---
 docs/glossary.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/glossary.rst b/docs/glossary.rst
index 39c2c13..cd3bf77 100644
--- a/docs/glossary.rst
+++ b/docs/glossary.rst
@@ -24,7 +24,7 @@ Glossary
         Similar to the R-Learner, the Double Machine Learning blueprint
         relies on estimating two nuisance models in its first stage: a
         propensity model as well as an outcome model. Unlike the
-        R-Learner, the last-stage or treatment effect model might need a
+        R-Learner, the last-stage or treatment effect model might need to be a
         specific type of estimator.
         See `Chernozhukov et al. (2016) `_.

From 36c10f564cbc34816ecab5de45b7f4b85fec334d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?=
Date: Mon, 24 Jun 2024 11:17:43 +0200
Subject: [PATCH 05/12] Improve basic example

---
 docs/examples/example_basic.ipynb | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/examples/example_basic.ipynb b/docs/examples/example_basic.ipynb
index f7135f8..33426c0 100644
--- a/docs/examples/example_basic.ipynb
+++ b/docs/examples/example_basic.ipynb
@@ -115,7 +115,9 @@
     "* We need to specify the observed treatment assignment ``w`` in the call to the\n",
     "  ``fit`` method.\n",
     "* We need to specify whether we want in-sample or out-of-sample\n",
-    "  CATE estimates in the {meth}`~metalearners.TLearner.predict` call via ``is_oos``."
+    "  CATE estimates in the {meth}`~metalearners.TLearner.predict` call via ``is_oos``. In the\n",
+    "  case of in-sample predictions, the data passed to {meth}`~metalearners.TLearner.predict`\n",
+    "  must be exactly the same as the data that was used to call {meth}`~metalearners.TLearner.fit`."
    ]
   },
   {
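The in-sample versus out-of-sample distinction documented in the patch above can be
sketched as follows; the `TLearner` constructor arguments are assumptions based on the
metalearners documentation, and the data is synthetic.

```python
import numpy as np
from lightgbm import LGBMRegressor
from metalearners import TLearner

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w = rng.integers(0, 2, size=1000)  # observed treatment assignment
y = X[:, 0] + w * (1 + X[:, 1]) + rng.normal(scale=0.1, size=1000)

tlearner = TLearner(
    nuisance_model_factory=LGBMRegressor,  # assumed constructor signature
    is_classification=False,
    n_variants=2,
)
tlearner.fit(X=X, y=y, w=w)

# In-sample estimates: X must be exactly the data that was passed to fit.
cate_in_sample = tlearner.predict(X, is_oos=False)

# Out-of-sample estimates: previously unseen data, hence is_oos=True.
X_new = rng.normal(size=(10, 5))
cate_out_of_sample = tlearner.predict(X_new, is_oos=True)
```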
From cee1ea6cf29c6eb1f84e50c0572e176b54368814 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?= <154450563+FrancescMartiEscofetQC@users.noreply.github.com>
Date: Mon, 24 Jun 2024 11:46:45 +0200
Subject: [PATCH 06/12] Update docs/examples/example_feature_importance_shap.ipynb

Co-authored-by: Kevin Klein <7267523+kklein@users.noreply.github.com>
---
 docs/examples/example_feature_importance_shap.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/examples/example_feature_importance_shap.ipynb b/docs/examples/example_feature_importance_shap.ipynb
index 5d710d5..33bdc86 100644
--- a/docs/examples/example_feature_importance_shap.ipynb
+++ b/docs/examples/example_feature_importance_shap.ipynb
@@ -378,7 +378,7 @@
     "The color coding implemented in these plots is straightforward: red implies a high feature value,\n",
     "while blue denotes a low feature value. This color scheme assists in identifying whether\n",
     "high or low values of a certain feature influence the model's output positively or negatively.\n",
-    "The categorical variables are grey colored.\n",
+    "The categorical variables are colored in grey.\n",
     "\n",
     "The Shapley value, exhibited on the horizontal axis, is oriented such that values on the\n",
     "right of the center line (0 mark) contribute to a positive shift in the predicted outcome,\n",

From 9cdab1a0ad760e0998c7f3bb9e841db2e81791da Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?= <154450563+FrancescMartiEscofetQC@users.noreply.github.com>
Date: Mon, 24 Jun 2024 11:47:01 +0200
Subject: [PATCH 07/12] Update docs/examples/example_lime.ipynb

Co-authored-by: Kevin Klein <7267523+kklein@users.noreply.github.com>
---
 docs/examples/example_lime.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/examples/example_lime.ipynb b/docs/examples/example_lime.ipynb
index 920f06f..5b1a9fd 100644
--- a/docs/examples/example_lime.ipynb
+++ b/docs/examples/example_lime.ipynb
@@ -361,7 +361,7 @@
    "metadata": {},
    "source": [
     "In these plots, the green bars signify that the corresponding feature referenced on the\n",
-    "y-axis positively impacts CATE estimation. On the other hand, the red bars represent\n",
+    "y-axis increases the CATE estimate. On the other hand, the red bars represent\n",
     "features that negatively impact the CATE estimation.\n",
     "Furthermore, the length of these coloured bars corresponds to the magnitude of each feature's\n",
     "contribution towards the model prediction. Therefore, the longer the bar, the more\n",

From 3c11fdcb0cc6fca6c7d87fbf7baf3f7bad7fdc23 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?= <154450563+FrancescMartiEscofetQC@users.noreply.github.com>
Date: Mon, 24 Jun 2024 11:47:06 +0200
Subject: [PATCH 08/12] Update docs/examples/example_lime.ipynb

Co-authored-by: Kevin Klein <7267523+kklein@users.noreply.github.com>
---
 docs/examples/example_lime.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/examples/example_lime.ipynb b/docs/examples/example_lime.ipynb
index 5b1a9fd..9b9c54a 100644
--- a/docs/examples/example_lime.ipynb
+++ b/docs/examples/example_lime.ipynb
@@ -362,7 +362,7 @@
    "source": [
     "In these plots, the green bars signify that the corresponding feature referenced on the\n",
     "y-axis increases the CATE estimate. On the other hand, the red bars represent\n",
-    "features that negatively impact the CATE estimation.\n",
+    "features that reduce CATE estimates.\n",
     "Furthermore, the length of these coloured bars corresponds to the magnitude of each feature's\n",
     "contribution towards the model prediction. Therefore, the longer the bar, the more\n",
     "significant the impact of that feature on the overall model prediction.\n",

From 1af3dd6c8153a1acc39a59a7e1c3fb942868c40a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?= <154450563+FrancescMartiEscofetQC@users.noreply.github.com>
Date: Mon, 24 Jun 2024 11:47:11 +0200
Subject: [PATCH 09/12] Update docs/examples/example_lime.ipynb

Co-authored-by: Kevin Klein <7267523+kklein@users.noreply.github.com>
---
 docs/examples/example_lime.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/examples/example_lime.ipynb b/docs/examples/example_lime.ipynb
index 9b9c54a..2b7f819 100644
--- a/docs/examples/example_lime.ipynb
+++ b/docs/examples/example_lime.ipynb
@@ -363,7 +363,7 @@
     "In these plots, the green bars signify that the corresponding feature referenced on the\n",
     "y-axis increases the CATE estimate. On the other hand, the red bars represent\n",
     "features that reduce CATE estimates.\n",
-    "Furthermore, the length of these coloured bars corresponds to the magnitude of each feature's\n",
+    "Furthermore, the length of these colored bars corresponds to the magnitude of each feature's\n",
     "contribution towards the model prediction. Therefore, the longer the bar, the more\n",
     "significant the impact of that feature on the overall model prediction.\n",

From a2da31994fbbfc7d4a81eb4c742209aad2e274df Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?= <154450563+FrancescMartiEscofetQC@users.noreply.github.com>
Date: Mon, 24 Jun 2024 11:47:24 +0200
Subject: [PATCH 10/12] Update docs/examples/example_lime.ipynb

Co-authored-by: Kevin Klein <7267523+kklein@users.noreply.github.com>
---
 docs/examples/example_lime.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/examples/example_lime.ipynb b/docs/examples/example_lime.ipynb
index 2b7f819..4a8a516 100644
--- a/docs/examples/example_lime.ipynb
+++ b/docs/examples/example_lime.ipynb
@@ -365,7 +365,7 @@
     "features that reduce CATE estimates.\n",
     "Furthermore, the length of these colored bars corresponds to the magnitude of each feature's\n",
     "contribution towards the model prediction. Therefore, the longer the bar, the more\n",
-    "significant the impact of that feature on the overall model prediction.\n",
+    "significant the impact of that feature on the model prediction.\n",
     "\n",
     "For more guidelines on how to interpret such lime plots please see the [lime documentation](https://github.com/marcotcr/lime)."
    ]

From 8f926921a03899faff28a764bd0effe01ac13484 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?=
Date: Mon, 24 Jun 2024 11:49:55 +0200
Subject: [PATCH 11/12] Reorder paragraphs shap

---
 docs/examples/example_feature_importance_shap.ipynb | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/examples/example_feature_importance_shap.ipynb b/docs/examples/example_feature_importance_shap.ipynb
index 33bdc86..9a0c8f4 100644
--- a/docs/examples/example_feature_importance_shap.ipynb
+++ b/docs/examples/example_feature_importance_shap.ipynb
@@ -375,15 +375,15 @@
     "The x-axis conveys the Shapley value, signifying the strength and directionality of the\n",
     "feature's impact. The y-axis displays each feature in the model.\n",
     "\n",
+    "The Shapley value, exhibited on the horizontal axis, is oriented such that values on the\n",
+    "right of the center line (0 mark) contribute to a positive shift in the predicted outcome,\n",
+    "while those on the left indicate a negative impact.\n",
+    "\n",
     "The color coding implemented in these plots is straightforward: red implies a high feature value,\n",
     "while blue denotes a low feature value. This color scheme assists in identifying whether\n",
     "high or low values of a certain feature influence the model's output positively or negatively.\n",
     "The categorical variables are colored in grey.\n",
     "\n",
-    "The Shapley value, exhibited on the horizontal axis, is oriented such that values on the\n",
-    "right of the center line (0 mark) contribute to a positive shift in the predicted outcome,\n",
-    "while those on the left indicate a negative impact.\n",
-    "\n",
     "For more guidelines on how to interpret such SHAP plots please see the [SHAP documentation](https://github.com/shap/shap).\n",

From a9b26e8b7b4304a0858c0c813baf591cf5ef0e9b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Francesc=20Mart=C3=AD=20Escofet?=
Date: Mon, 24 Jun 2024 11:51:19 +0200
Subject: [PATCH 12/12] Use subset instead of all features

---
 docs/examples/example_feature_importance_shap.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/examples/example_feature_importance_shap.ipynb b/docs/examples/example_feature_importance_shap.ipynb
index 9a0c8f4..4a4a2b8 100644
--- a/docs/examples/example_feature_importance_shap.ipynb
+++ b/docs/examples/example_feature_importance_shap.ipynb
@@ -373,7 +373,7 @@
     "\n",
     "Each dot in the plot represents a single instance of the given feature present in the data set.\n",
     "The x-axis conveys the Shapley value, signifying the strength and directionality of the\n",
-    "feature's impact. The y-axis displays each feature in the model.\n",
+    "feature's impact. The y-axis displays a subset of the features in the model.\n",
     "\n",
     "The Shapley value, exhibited on the horizontal axis, is oriented such that values on the\n",
     "right of the center line (0 mark) contribute to a positive shift in the predicted outcome,\n",