[Summary] Explanation Operations #13

Open
nfelnlp opened this issue Mar 23, 2023 · 2 comments
Labels: `enhancement` (New feature or request), `summary`

Comments


nfelnlp commented Mar 23, 2023

| Operation | Terminals / Prompts | Action | Description | Tools | Status |
| --- | --- | --- | --- | --- | --- |
| nlpattribute | `nlpattribute token \| phrase \| sentence {classes}` | feature_importance | Provides feature importances at the token (default), phrase or sentence level. | Captum (Integrated Gradients) | |
| globaltopk | `important {number} {classes}` | global_topk | Returns the top k most attributed tokens across the entire dataset. | Captum (Integrated Gradients) | |
| nlpcfe | `nlpcfe {number}` | counterfactuals | Returns counterfactual explanations (model predicts another label) for a single instance. | Polyjuice | |
| adversarial | `adversarial {number}` | | Returns adversarial examples (model predicts a wrong label) for a single instance. | OpenAttack | |
| similar | `similar {number}` | similarity | Gets the training data instances that are most similar to the current one. | Sentence Transformers | |
| rules | `rules {number}` | | Outputs the decision rules for the dataset. | Anchors | |
| interact | `interact` | | Gets feature interactions. | HEDGE | |
| rationalize | `rationalize` | rationalize | Explains the prediction for some specified instance in natural language. | Zero-shot prompting with GPTNeo parser | |
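For context on what `nlpattribute` computes, here is a minimal NumPy sketch of the Integrated Gradients idea that Captum implements (the actual tool named above is Captum's `IntegratedGradients`; the linear toy "model" and numerical gradients below are illustrative assumptions, not project code):

```python
import numpy as np

def integrated_gradients(f, x, baseline, steps=50):
    """Approximate Integrated Gradients for a scalar function f
    via a midpoint Riemann sum over the straight path baseline -> x."""
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    eps = 1e-5
    for a in alphas:
        point = baseline + a * (x - baseline)
        # Central-difference numerical gradient of f at the path point
        grad = np.zeros_like(x)
        for i in range(x.size):
            d = np.zeros_like(x)
            d[i] = eps
            grad[i] = (f(point + d) - f(point - d)) / (2 * eps)
        total += grad
    # Scale the averaged gradients by (input - baseline)
    return (x - baseline) * total / steps

# Toy "model": a linear score over three 1-d token embeddings
w = np.array([0.5, -1.0, 2.0])
f = lambda z: float(w @ z)
x = np.array([1.0, 1.0, 1.0])
attributions = integrated_gradients(f, x, baseline=np.zeros(3))
# For a linear model, IG reduces exactly to w * (x - baseline)
```

In practice Captum applies this to the embedding layer of the classifier, and the per-dimension attributions are summed per token to yield token-level importances.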

nfelnlp commented Mar 30, 2023

adversarial (via OpenAttack) has more than twice the execution time of Polyjuice, which already takes quite a while. Since CFEs already cover a similar operation, OpenAttack is no longer part of the roadmap. The long-term plan is to train one multi-purpose model that can reasonably perturb text for generating adversarial attacks, counterfactuals and general data augmentation at once.

interact (via HEDGE) cannot be implemented, because hierarchical explanations don't have an obvious natural language representation. Visualizations are not on the agenda as of now.

rules (via Anchors) does not appear to return rules that are inherently meaningful (mostly single tokens) and takes very long to compute.

rationalize (via OpenAI API or a rationalizing LLM) will be implemented soon.


nfelnlp commented Apr 13, 2023

For rationalize, we can do the following:

  1. Design one prompt for each dataset
  2. Insert the input texts into the prompts
  3. Use GPT-3.5 / -4 to generate a few hundred rationales in a zero-shot setup
  4. Fine-tune a T5 for each dataset of rationales
  5. Run inference with the fine-tuned T5 to produce rationales for the rest of the datasets (because using ChatGPT for the tens of thousands of examples in BoolQ, OLID & DD is too expensive)
  6. Store generated rationales as CSVs or JSONs (see pre-computed feature attribution explanations in the cache folder for reference)
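Steps 1, 2 and 6 of the plan above could look roughly like this (a sketch only: the prompt template, dataset key, and JSON layout are illustrative assumptions, not the project's actual format):

```python
import json

# Step 1: one prompt template per dataset (hypothetical template for BoolQ)
PROMPTS = {
    "boolq": (
        "Passage: {passage}\nQuestion: {question}\n"
        "Explain in one sentence why the answer is '{label}'."
    ),
}

def build_prompt(dataset: str, instance: dict) -> str:
    """Step 2: insert the input texts of one instance into the prompt."""
    return PROMPTS[dataset].format(**instance)

def store_rationales(rationales: list, path: str) -> None:
    """Step 6: store generated rationales as JSON for the cache folder."""
    with open(path, "w") as fh:
        json.dump(rationales, fh, indent=2)

instance = {
    "passage": "Cats are mammals.",
    "question": "Are cats mammals?",
    "label": "yes",
}
prompt = build_prompt("boolq", instance)
```

The filled prompt would then be sent to GPT-3.5 / -4 (step 3), and the collected rationales used as fine-tuning data for T5 (steps 4–5).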

@nfelnlp added the `enhancement` (New feature or request) label on May 7, 2023