2021-07-01-agarwal21c.md

title: "Towards the Unification and Robustness of Perturbation and Gradient Based Explanations"
abstract: "As machine learning black boxes are increasingly being deployed in critical domains such as healthcare and criminal justice, there has been a growing emphasis on developing techniques for explaining these black boxes in a post hoc manner. In this work, we analyze two popular post hoc interpretation techniques: SmoothGrad which is a gradient based method, and a variant of LIME which is a perturbation based method. More specifically, we derive explicit closed form expressions for the explanations output by these two methods and show that they both converge to the same explanation in expectation, i.e., when the number of perturbed samples used by these methods is large. We then leverage this connection to establish other desirable properties, such as robustness, for these techniques. We also derive finite sample complexity bounds for the number of perturbations required for these methods to converge to their expected explanation. Finally, we empirically validate our theory using extensive experimentation on both synthetic and real-world datasets."
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: agarwal21c
month: 0
tex_title: "Towards the Unification and Robustness of Perturbation and Gradient Based Explanations"
firstpage: 110
lastpage: 119
page: 110-119
order: 110
cycles: false
bibtex_author: "Agarwal, Sushant and Jabbari, Shahin and Agarwal, Chirag and Upadhyay, Sohini and Wu, Steven and Lakkaraju, Himabindu"
author:
- given: Sushant
  family: Agarwal
- given: Shahin
  family: Jabbari
- given: Chirag
  family: Agarwal
- given: Sohini
  family: Upadhyay
- given: Steven
  family: Wu
- given: Himabindu
  family: Lakkaraju
date: 2021-07-01
address:
container-title: "Proceedings of the 38th International Conference on Machine Learning"
volume: 139
genre: inproceedings
issued:
  date-parts:
  - 2021
  - 7
  - 1
pdf:
extras:
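
The abstract's central claim, that SmoothGrad and a LIME variant agree in expectation, can be illustrated numerically. The sketch below is not the authors' code. The test function `f`, the point `x`, the noise scale `sigma`, and all helper names are hypothetical, and the LIME variant is assumed here to be an unweighted least-squares linear fit on Gaussian perturbations, which may differ from the paper's exact formulation.

```python
import math
import random


def f(a0, a1):
    """Hypothetical black box, chosen only for illustration."""
    return math.sin(a0) + a1 ** 2


def grad_f(a0, a1):
    """Analytic gradient of f."""
    return (math.cos(a0), 2.0 * a1)


def sample(x, sigma, n, rng):
    """Draw n Gaussian perturbations of the point x."""
    return [(rng.gauss(x[0], sigma), rng.gauss(x[1], sigma)) for _ in range(n)]


def smoothgrad(x, sigma, n, rng):
    """Average the gradient of f over the perturbed points."""
    grads = [grad_f(*a) for a in sample(x, sigma, n, rng)]
    return (sum(g[0] for g in grads) / n, sum(g[1] for g in grads) / n)


def lime_variant(x, sigma, n, rng):
    """Slopes of an unweighted least-squares linear fit to f (assumed variant)."""
    pts = sample(x, sigma, n, rng)
    ys = [f(*a) for a in pts]
    m0 = sum(a[0] for a in pts) / n
    m1 = sum(a[1] for a in pts) / n
    my = sum(ys) / n
    # Sample covariances of the features, and of each feature with the output.
    s00 = sum((a[0] - m0) ** 2 for a in pts) / n
    s11 = sum((a[1] - m1) ** 2 for a in pts) / n
    s01 = sum((a[0] - m0) * (a[1] - m1) for a in pts) / n
    c0 = sum((a[0] - m0) * (y - my) for a, y in zip(pts, ys)) / n
    c1 = sum((a[1] - m1) * (y - my) for a, y in zip(pts, ys)) / n
    # Solve the 2x2 normal equations [[s00, s01], [s01, s11]] w = [c0, c1].
    det = s00 * s11 - s01 ** 2
    return ((s11 * c0 - s01 * c1) / det, (s00 * c1 - s01 * c0) / det)


rng = random.Random(0)  # fixed seed so the run is repeatable
x = (0.3, -0.7)
sg = smoothgrad(x, sigma=0.5, n=100_000, rng=rng)
lm = lime_variant(x, sigma=0.5, n=100_000, rng=rng)
print("SmoothGrad estimate:  ", sg)
print("LIME-variant estimate:", lm)
print("max abs gap:", max(abs(sg[0] - lm[0]), abs(sg[1] - lm[1])))
```

With many perturbations the two estimates should nearly coincide. Lowering `n` makes them drift apart, which is the regime the finite sample complexity bounds mentioned in the abstract are concerned with.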