Area
Machine Learning: Auto-encoder
Oscar Li, Hao Liu, Chaofan Chen, Cynthia Rudin: Deep Learning for Case-Based Reasoning Through Prototypes: A Neural Network That Explains Its Predictions. AAAI 2018: 3530-3537
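The prototype network above classifies by comparing a latent encoding of the input against a small set of learned prototype vectors, so each prediction can be explained by pointing to its nearest prototypes. A minimal numpy sketch of that comparison step (the encoder output, the prototypes, and the `prototype_logits` helper are all made up for illustration; the paper actually feeds the distances to a final linear layer rather than using them directly as scores):

```python
import numpy as np

def prototype_logits(z, prototypes):
    """Score a latent encoding z against learned prototype vectors.

    z:          (d,) latent encoding of one input
    prototypes: (k, d) matrix of prototype vectors
    Returns (k,) scores; the closest prototype gets the highest score,
    and the distances themselves serve as the case-based explanation.
    """
    # Squared Euclidean distance from z to each prototype.
    dists = np.sum((prototypes - z) ** 2, axis=1)
    # Smaller distance -> larger score (negated distances as class scores).
    return -dists

# Toy usage with random stand-ins for the encoder output and prototypes.
rng = np.random.default_rng(0)
z = rng.normal(size=4)                # pretend encoder output
prototypes = rng.normal(size=(3, 4))  # pretend learned prototypes
print(prototype_logits(z, prototypes))
```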
NLP:
Hui Liu, Qingyu Yin, William Yang Wang: Towards Explainable NLP: A Generative Explanation Framework for Text Classification. CoRR abs/1811.00196 (2018)
Computer Vision
Uncertainty Map:
Alex Kendall, Yarin Gal: What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? NIPS 2017: 5580-5590
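Kendall and Gal separate epistemic (model) uncertainty from aleatoric (data) uncertainty and estimate both from repeated stochastic forward passes that each predict a mean and an observation variance. A minimal sketch of how those passes are combined in the regression case (the `decompose_uncertainty` helper and the fabricated outputs are illustrative, not the authors' code):

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Combine T stochastic forward passes into the two uncertainty terms
    discussed by Kendall & Gal (regression case).

    means:     (T,) predicted means from T MC-dropout passes
    variances: (T,) predicted observation variances from the same passes
    """
    epistemic = means.var()        # spread of the model's own predictions
    aleatoric = variances.mean()   # average predicted data noise
    return epistemic, aleatoric

# Toy usage with fabricated network outputs (no real network here).
rng = np.random.default_rng(1)
means = 2.0 + 0.1 * rng.normal(size=50)
variances = 0.3 + 0.02 * rng.normal(size=50)
print(decompose_uncertainty(means, variances))
```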
Saliency Map:
Julius Adebayo, Justin Gilmer, Michael Muelly, Ian J. Goodfellow, Moritz Hardt, Been Kim: Sanity Checks for Saliency Maps. NeurIPS 2018: 9525-9536
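The sanity checks in this paper are applied to gradient-based saliency maps, which at their simplest are just the absolute gradient of a class score with respect to the input. A minimal sketch for a hand-written two-layer ReLU network, so the gradient can be computed without an autodiff library (the weights and the `saliency_map` helper are made up for illustration):

```python
import numpy as np

def saliency_map(x, W1, W2, target):
    """Vanilla gradient saliency for a tiny ReLU network f(x) = W2 @ relu(W1 @ x).

    Returns |d f_target / d x|, the absolute input gradient that a
    saliency map visualizes as a heatmap over input features.
    """
    h = W1 @ x
    relu_mask = (h > 0).astype(float)
    # Chain rule: d f_target / dx = W1^T @ (relu'(h) * W2[target])
    grad = W1.T @ (relu_mask * W2[target])
    return np.abs(grad)

rng = np.random.default_rng(2)
x = rng.normal(size=16)          # stand-in "image" of 16 pixels
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(3, 8))
print(saliency_map(x, W1, W2, target=0))
```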
Game Theory
Shapley Additive Explanation
Scott M. Lundberg, Su-In Lee: A Unified Approach to Interpreting Model Predictions. NIPS 2017: 4768-4777
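SHAP assigns each feature its Shapley value: the feature's average marginal contribution to the prediction over all subsets of the other features. A brute-force sketch of that definition (exponential in the number of features, so only usable on toy models; the paper's contribution is a unified, tractable approximation of exactly this quantity; `value_fn` and the toy linear model below are illustrative):

```python
import itertools
import math
import numpy as np

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set-valued payoff function.

    value_fn(S) must return the model's prediction when only the features
    in the index set S are 'present' (e.g. others replaced by a baseline).
    """
    phi = np.zeros(n_features)
    features = range(n_features)
    for i in features:
        for r in range(n_features):
            for S in itertools.combinations([j for j in features if j != i], r):
                # Classic Shapley weight |S|! (n - |S| - 1)! / n!
                weight = (math.factorial(len(S)) * math.factorial(n_features - len(S) - 1)
                          / math.factorial(n_features))
                phi[i] += weight * (value_fn(set(S) | {i}) - value_fn(set(S)))
    return phi

# Toy payoff: a linear model where missing features fall back to a baseline of 0.
w = np.array([1.0, -2.0, 0.5])
x = np.array([3.0, 1.0, 4.0])
value_fn = lambda S: sum(w[j] * x[j] for j in S)
print(shapley_values(value_fn, 3))  # recovers w * x for this additive model
```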
Planning and Scheduling
XAI Plan:
Rita Borgo, Michael Cashmore, Daniele Magazzeni: Towards Providing Explanations for AI Planner Decisions. CoRR abs/1810.06338 (2018)
Human-in-the-loop Planning:
Maria Fox, Derek Long, Daniele Magazzeni: Explainable Planning. CoRR abs/1709.10256 (2017)
Robotics
Narration of Autonomous Robot Experience:
Stephanie Rosenthal, Sai P. Selvaraj, Manuela Veloso: Verbalization: Narration of Autonomous Robot Experience. IJCAI 2016: 862-868
From Decision Tree to human-friendly information:
Raymond Ka-Man Sheh: “Why Did You Do That?” Explainable Intelligent Robots. AAAI Workshops 2017
The Need to Explain
• User Acceptance & Trust [Lipton 2016, Ribeiro 2016, Weld and Bansal 2018]
• Legal [Goodman and Flaxman 2016, Wachter 2017]
  • Conformance to ethical standards, fairness
  • Right to be informed
  • Contestable decisions
• Explanatory Debugging [Kulesza et al. 2014, Weld and Bansal 2018]
  • Flawed performance metrics
  • Inadequate features
  • Distributional drift
• Increase Insightfulness
  • Informativeness
  • Uncovering causality
ACL 2018
Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation. Tiancheng Zhao, Kyusong Lee, Maxine Eskenazi.
Word Embedding and WordNet Based Metaphor Identification and Interpretation. Rui Mao, Chenghua Lin, Frank Guerin.
Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder. Ryo Takahashi, Ran Tian, Kentaro Inui.
Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph. AmirAli Bagher Zadeh, Paul Pu Liang, Soujanya Poria, Erik Cambria, Louis-Philippe Morency.
Automatic Estimation of Simultaneous Interpreter Performance. Craig Stewart, Nikolai Vogler, Junjie Hu, Jordan Boyd-Graber, Graham Neubig.
ACL 2017
An Interpretable Knowledge Transfer Model for Knowledge Base Completion. Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy.
Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation. Lotem Peled, Roi Reichart.
Information-Theory Interpretation of the Skip-Gram Negative-Sampling Objective Function. Oren Melamud, Jacob Goldberger.
ICLR 2018
Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning
Interpretable Counting for Visual Question Answering
ICML 2018
Variational Bayes and Beyond: Bayesian Inference for Big Data. Tamara Broderick (MIT)
Learning to Explain: An Information Theoretic Perspective on Model Interpretation
Jianbo Chen, Le Song, Martin Wainwright, Michael Jordan
Discovering Interpretable Representations for Both Deep Generative and Discriminative Models
Tameem Adel, Zoubin Ghahramani, Adrian Weller
Programmatically Interpretable Reinforcement Learning
Abhinav Verma, Vijayaraghavan Murali, Rishabh Singh, Pushmeet Kohli, Swarat Chaudhuri
Differentiable Abstract Interpretation for Provably Robust Neural Networks
Matthew Mirman, Timon Gehr, Martin Vechev
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda B. Viégas, Rory Sayres (see the TCAV sketch after this list)
oi-VAE: Output Interpretable VAEs for Nonlinear Group Factor Analysis
Samuel Ainsworth, Nicholas J. Foti, Adrian KC Lee, Emily Fox
Fairness, Interpretability, and Explainability Federation of Workshops
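Roughly, TCAV (cited above) learns a concept activation vector (CAV) that separates a concept's examples from random counterexamples in a layer's activation space, then reports the fraction of a class's examples whose class logit increases along that direction. A simplified numpy sketch, using the difference of mean activations as the CAV instead of the paper's linear classifier (the `tcav_score` helper, the toy "top of the network", and all data are made up for illustration):

```python
import numpy as np

def tcav_score(concept_acts, random_acts, class_acts, logit_grad_fn):
    """Simplified TCAV score (numpy-only sketch).

    concept_acts, random_acts, class_acts: (n, d) layer activations for
    concept examples, random counterexamples, and examples of the class
    being explained. logit_grad_fn(a) returns the gradient of the class
    logit with respect to the layer activations a.
    """
    # CAV as normalized difference of mean activations (a simplification
    # of the paper's linear-classifier CAV).
    cav = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    cav /= np.linalg.norm(cav)
    # Directional derivative of the class logit along the CAV, per example.
    sensitivities = np.array([logit_grad_fn(a) @ cav for a in class_acts])
    return (sensitivities > 0).mean()   # fraction of examples the concept pushes up

# Toy usage with fabricated activations and a tiny nonlinear "top of the network".
rng = np.random.default_rng(3)
W1, w2 = rng.normal(size=(6, 4)), rng.normal(size=6)
logit_grad = lambda a: W1.T @ (((W1 @ a) > 0).astype(float) * w2)  # grad of w2·relu(W1 a)
concept_acts = rng.normal(loc=1.0, size=(20, 4))
random_acts = rng.normal(loc=0.0, size=(20, 4))
class_acts = rng.normal(size=(30, 4))
print(tcav_score(concept_acts, random_acts, class_acts, logit_grad))
```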
NeurIPS 2018
Towards Robust Interpretability with Self-Explaining Neural Networks
Representer Point Selection for Explaining Deep Neural Networks
Explaining Deep Learning Models – A Bayesian Non-parametric Approach
Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections
Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability
Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples
Learning Conditioned Graph Structures for Interpretable Visual Question Answering
Diminishing Returns Shape Constraints for Interpretability and Regularization
Uncertainty-Aware Attention for Reliable Interpretation and Prediction
Integrated Gradients:
Mukund Sundararajan, Ankur Taly, Qiqi Yan: Axiomatic Attribution for Deep Networks. ICML 2017
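Integrated Gradients attributes f(x) to input features by integrating the gradient along the straight line from a baseline x' to the input x: IG_i(x) = (x_i − x'_i) ∫₀¹ ∂f/∂x_i (x' + α(x − x')) dα. A Riemann-sum sketch for any differentiable scalar function with a caller-supplied gradient (the `integrated_gradients` helper and the toy quadratic model are illustrative, not the paper's code):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a Riemann sum along the
    straight-line path from `baseline` to the input `x`.

    grad_fn(z) must return the gradient of the scalar model output at z.
    """
    alphas = (np.arange(steps) + 0.5) / steps          # midpoints in (0, 1)
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy usage on f(x) = sum(x**2), whose gradient is 2x.
x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients(lambda z: 2 * z, x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline).
print(attributions, attributions.sum(), (x ** 2).sum())
```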