Area
Machine Learning: Auto-encoder
Oscar Li, Hao Liu, Chaofan Chen, Cynthia Rudin: Deep Learning for Case-Based Reasoning Through Prototypes: A Neural Network That Explains Its Predictions. AAAI 2018: 3530-3537
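The prototype network above classifies by comparing a latent encoding of the input against a small set of learned prototype vectors, so each prediction can be explained by pointing to its nearest prototypes. A minimal numpy sketch of that comparison step (the encoder output, the prototypes, and the `prototype_logits` helper are all made up for illustration; the paper actually feeds the distances to a final linear layer rather than using them directly as scores):

```python
import numpy as np

def prototype_logits(z, prototypes):
    """Score a latent encoding z against learned prototype vectors.

    z:          (d,) latent encoding of one input
    prototypes: (k, d) matrix of prototype vectors
    Returns (k,) scores; the closest prototype gets the highest score,
    and the distances themselves serve as the case-based explanation.
    """
    # Squared Euclidean distance from z to each prototype.
    dists = np.sum((prototypes - z) ** 2, axis=1)
    # Smaller distance -> larger score (negated distances as class scores).
    return -dists

# Toy usage with random stand-ins for the encoder output and prototypes.
rng = np.random.default_rng(0)
z = rng.normal(size=4)                # pretend encoder output
prototypes = rng.normal(size=(3, 4))  # pretend learned prototypes
print(prototype_logits(z, prototypes))
```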
NLP:
Hui Liu, Qingyu Yin, William Yang Wang: Towards Explainable NLP: A Generative Explanation Framework for Text Classification. CoRR abs/1811.00196 (2018)
Computer Vision
Uncertainty Map:
Alex Kendall, Yarin Gal: What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? NIPS 2017: 5580-5590
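Kendall and Gal separate epistemic (model) uncertainty from aleatoric (data) uncertainty and estimate both from repeated stochastic forward passes that each predict a mean and an observation variance. A minimal sketch of how those passes are combined in the regression case (the `decompose_uncertainty` helper and the fabricated outputs are illustrative, not the authors' code):

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Combine T stochastic forward passes into the two uncertainty terms
    discussed by Kendall & Gal (regression case).

    means:     (T,) predicted means from T MC-dropout passes
    variances: (T,) predicted observation variances from the same passes
    """
    epistemic = means.var()        # spread of the model's own predictions
    aleatoric = variances.mean()   # average predicted data noise
    return epistemic, aleatoric

# Toy usage with fabricated network outputs (no real network here).
rng = np.random.default_rng(1)
means = 2.0 + 0.1 * rng.normal(size=50)
variances = 0.3 + 0.02 * rng.normal(size=50)
print(decompose_uncertainty(means, variances))
```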
Saliency Map:
Julius Adebayo, Justin Gilmer, Michael Muelly, Ian J. Goodfellow, Moritz Hardt, Been Kim: Sanity Checks for Saliency Maps. NeurIPS 2018: 9525-9536
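The sanity checks in this paper are applied to gradient-based saliency maps, which at their simplest are just the absolute gradient of a class score with respect to the input. A minimal sketch for a hand-written two-layer ReLU network, so the gradient can be computed without an autodiff library (the weights and the `saliency_map` helper are made up for illustration):

```python
import numpy as np

def saliency_map(x, W1, W2, target):
    """Vanilla gradient saliency for a tiny ReLU network f(x) = W2 @ relu(W1 @ x).

    Returns |d f_target / d x|, the absolute input gradient that a
    saliency map visualizes as a heatmap over input features.
    """
    h = W1 @ x
    relu_mask = (h > 0).astype(float)
    # Chain rule: d f_target / dx = W1^T @ (relu'(h) * W2[target])
    grad = W1.T @ (relu_mask * W2[target])
    return np.abs(grad)

rng = np.random.default_rng(2)
x = rng.normal(size=16)          # stand-in "image" of 16 pixels
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(3, 8))
print(saliency_map(x, W1, W2, target=0))
```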
Game Theory
Shapley Additive Explanation
Scott M. Lundberg, Su-In Lee: A Unified Approach to Interpreting Model Predictions. NIPS 2017: 4768-4777
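SHAP assigns each feature its Shapley value: the feature's average marginal contribution to the prediction over all subsets of the other features. A brute-force sketch of that definition (exponential in the number of features, so only usable on toy models; the paper's contribution is a unified, tractable approximation of exactly this quantity; `value_fn` and the toy linear model below are illustrative):

```python
import itertools
import math
import numpy as np

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set-valued payoff function.

    value_fn(S) must return the model's prediction when only the features
    in the index set S are 'present' (e.g. others replaced by a baseline).
    """
    phi = np.zeros(n_features)
    features = range(n_features)
    for i in features:
        for r in range(n_features):
            for S in itertools.combinations([j for j in features if j != i], r):
                # Classic Shapley weight |S|! (n - |S| - 1)! / n!
                weight = (math.factorial(len(S)) * math.factorial(n_features - len(S) - 1)
                          / math.factorial(n_features))
                phi[i] += weight * (value_fn(set(S) | {i}) - value_fn(set(S)))
    return phi

# Toy payoff: a linear model where missing features fall back to a baseline of 0.
w = np.array([1.0, -2.0, 0.5])
x = np.array([3.0, 1.0, 4.0])
value_fn = lambda S: sum(w[j] * x[j] for j in S)
print(shapley_values(value_fn, 3))  # recovers w * x for this additive model
```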
Planning and Scheduling
XAI Plan:
Rita Borgo, Michael Cashmore, Daniele Magazzeni: Towards Providing Explanations for AI Planner Decisions. CoRR abs/1810.06338 (2018)
Human-in-the-loop Planning:
Maria Fox, Derek Long, Daniele Magazzeni: Explainable Planning. CoRR abs/1709.10256 (2017)
Robotics
Narration of Autonomous Robot Experience:
Stephanie Rosenthal, Sai P. Selvaraj, Manuela Veloso: Verbalization: Narration of Autonomous Robot Experience. IJCAI 2016: 862-868
From Decision Tree to human-friendly information:
Raymond Ka-Man Sheh: “Why Did You Do That?” Explainable Intelligent Robots. AAAI Workshops 2017
The Need to Explain
• User Acceptance & Trust [Lipton 2016, Ribeiro 2016, Weld and Bansal 2018]
• Legal [Goodman and Flaxman 2016, Wachter 2017]
  • Conformance to ethical standards, fairness
  • Right to be informed
  • Contestable decisions
• Explanatory Debugging [Kulesza et al. 2014, Weld and Bansal 2018]
  • Flawed performance metrics
  • Inadequate features
  • Distributional drift
• Increase Insightfulness
  • Informativeness
  • Uncovering causality
ACL 2018
Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation. Tiancheng Zhao, Kyusong Lee, Maxine Eskenazi.
Word Embedding and WordNet Based Metaphor Identification and Interpretation. Rui Mao, Chenghua Lin, Frank Guerin.
Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder. Ryo Takahashi, Ran Tian, Kentaro Inui.
Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph. AmirAli Bagher Zadeh, Paul Pu Liang, Soujanya Poria, Erik Cambria, Louis-Philippe Morency.
Automatic Estimation of Simultaneous Interpreter Performance. Craig Stewart, Nikolai Vogler, Junjie Hu, Jordan Boyd-Graber, Graham Neubig.
ACL 2017
An Interpretable Knowledge Transfer Model for Knowledge Base Completion. Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy.
Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation. Lotem Peled, Roi Reichart.
Information-Theory Interpretation of the Skip-Gram Negative-Sampling Objective Function. Oren Melamud, Jacob Goldberger.
ICLR 2018
Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning
Interpretable Counting for Visual Question Answering
ICML 2018
Variational Bayes and Beyond: Bayesian Inference for Big Data. Tamara Broderick (MIT)
Learning to Explain: An Information Theoretic Perspective on Model Interpretation
Jianbo Chen, Le Song, Martin Wainwright, Michael Jordan
Discovering Interpretable Representations for Both Deep Generative and Discriminative Models
Tameem Adel, Zoubin Ghahramani, Adrian Weller
Programmatically Interpretable Reinforcement Learning
Abhinav Verma, Vijayaraghavan Murali, Rishabh Singh, Pushmeet Kohli, Swarat Chaudhuri
Differentiable Abstract Interpretation for Provably Robust Neural Networks
Matthew Mirman, Timon Gehr, Martin Vechev
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda B. Viégas, Rory Sayres (see the TCAV sketch after this list)
oi-VAE: Output Interpretable VAEs for Nonlinear Group Factor Analysis
Samuel Ainsworth, Nicholas J. Foti, Adrian KC Lee, Emily Fox
Fairness, Interpretability, and Explainability Federation of Workshops
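Roughly, TCAV (cited above) learns a concept activation vector (CAV) that separates a concept's examples from random counterexamples in a layer's activation space, then reports the fraction of a class's examples whose class logit increases along that direction. A simplified numpy sketch, using the difference of mean activations as the CAV instead of the paper's linear classifier (the `tcav_score` helper, the toy "top of the network", and all data are made up for illustration):

```python
import numpy as np

def tcav_score(concept_acts, random_acts, class_acts, logit_grad_fn):
    """Simplified TCAV score (numpy-only sketch).

    concept_acts, random_acts, class_acts: (n, d) layer activations for
    concept examples, random counterexamples, and examples of the class
    being explained. logit_grad_fn(a) returns the gradient of the class
    logit with respect to the layer activations a.
    """
    # CAV as normalized difference of mean activations (a simplification
    # of the paper's linear-classifier CAV).
    cav = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    cav /= np.linalg.norm(cav)
    # Directional derivative of the class logit along the CAV, per example.
    sensitivities = np.array([logit_grad_fn(a) @ cav for a in class_acts])
    return (sensitivities > 0).mean()   # fraction of examples the concept pushes up

# Toy usage with fabricated activations and a tiny nonlinear "top of the network".
rng = np.random.default_rng(3)
W1, w2 = rng.normal(size=(6, 4)), rng.normal(size=6)
logit_grad = lambda a: W1.T @ (((W1 @ a) > 0).astype(float) * w2)  # grad of w2·relu(W1 a)
concept_acts = rng.normal(loc=1.0, size=(20, 4))
random_acts = rng.normal(loc=0.0, size=(20, 4))
class_acts = rng.normal(size=(30, 4))
print(tcav_score(concept_acts, random_acts, class_acts, logit_grad))
```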
NeurIPS 2018
Towards Robust Interpretability with Self-Explaining Neural Networks
Representer Point Selection for Explaining Deep Neural Networks
Explaining Deep Learning Models – A Bayesian Non-parametric Approach
Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections
Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability
Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples
Learning Conditioned Graph Structures for Interpretable Visual Question Answering
Diminishing Returns Shape Constraints for Interpretability and Regularization
Uncertainty-Aware Attention for Reliable Interpretation and Prediction
Integrated Gradients:
Mukund Sundararajan, Ankur Taly, Qiqi Yan: Axiomatic Attribution for Deep Networks. ICML 2017
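Integrated Gradients attributes f(x) to input features by integrating the gradient along the straight line from a baseline x' to the input x: IG_i(x) = (x_i − x'_i) ∫₀¹ ∂f/∂x_i (x' + α(x − x')) dα. A Riemann-sum sketch for any differentiable scalar function with a caller-supplied gradient (the `integrated_gradients` helper and the toy quadratic model are illustrative, not the paper's code):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a Riemann sum along the
    straight-line path from `baseline` to the input `x`.

    grad_fn(z) must return the gradient of the scalar model output at z.
    """
    alphas = (np.arange(steps) + 0.5) / steps          # midpoints in (0, 1)
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy usage on f(x) = sum(x**2), whose gradient is 2x.
x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients(lambda z: 2 * z, x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline).
print(attributions, attributions.sum(), (x ** 2).sum())
```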