Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications

Type: Review

Publication Date: 2021-03-01

Citations: 148

DOI: https://doi.org/10.1109/jproc.2021.3060483


Abstract

With the broad and highly successful use of machine learning in industry and the sciences, there has been a growing demand for explainable AI. Interpretability and explanation methods for gaining a better understanding of the problem-solving abilities and strategies of nonlinear machine learning, in particular deep neural networks, are therefore receiving increased attention. In this work we aim to (1) provide a timely overview of this active emerging field, with a focus on 'post-hoc' explanations, and explain its theoretical foundations; (2) put interpretability algorithms to a test, both from a theoretical and a comparative evaluation perspective, using extensive simulations; (3) outline best-practice aspects, i.e., how best to include interpretation methods in the standard usage of machine learning; and (4) demonstrate successful usage of explainable AI in a representative selection of application scenarios. Finally, we discuss challenges and possible future directions of this exciting foundational field of machine learning.
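Among the 'post-hoc' explanation methods this review surveys are gradient-based attributions. As a minimal illustrative sketch (not code from the paper, and with hypothetical weights), the common "gradient × input" heuristic can be computed by hand for a tiny two-layer ReLU network:

```python
import numpy as np

# Minimal sketch of post-hoc "gradient x input" attribution for a tiny
# two-layer ReLU network. All weights here are random/hypothetical.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # hidden x input weight matrix
w2 = rng.normal(size=4)        # output weights (scalar output)

def forward(x):
    z = W1 @ x                 # pre-activations (zero biases)
    h = np.maximum(z, 0.0)     # ReLU
    return w2 @ h, z

def gradient_x_input(x):
    """Relevance of each input feature i: (df/dx_i) * x_i."""
    _, z = forward(x)
    mask = (z > 0).astype(float)   # ReLU derivative
    grad = W1.T @ (w2 * mask)      # chain rule through the network
    return grad * x

x = np.array([1.0, -0.5, 2.0])
relevance = gradient_x_input(x)
f, _ = forward(x)
# With zero biases the network is positively homogeneous, so the
# attributions sum exactly to the output (a "completeness" property).
print(np.isclose(relevance.sum(), f))  # True
```

The completeness check at the end illustrates one of the sanity criteria (conservation of the output score) that several of the surveyed evaluation approaches use to compare attribution methods.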

Locations

  • arXiv (Cornell University) - View - PDF
  • DataCite API - View
  • Proceedings of the IEEE - View - PDF

Similar Works

Action Title Year Authors
+ Toward Interpretable Machine Learning: Transparent Deep Neural Networks and Beyond 2020 Wojciech Samek
Grégoire Montavon
Sebastian Lapuschkin
Christopher J. Anders
Klaus‐Robert Müller
+ Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey 2020 Arun Das
Paul Rad
+ Hide-and-Seek: A Template for Explainable AI 2020 Thanos Tagaris
Andreas Stafylopatis
+ Explaining Deep Neural Networks by Leveraging Intrinsic Methods 2024 Biagio La Rosa
+ Explaining Explanations: An Overview of Interpretability of Machine Learning 2018 Leilani H. Gilpin
David Bau
Ben Z. Yuan
Ayesha Bajwa
Michael A. Specter
Lalana Kagal
+ Explainable Deep Learning: A Field Guide for the Uninitiated 2022 Gabriëlle Ras
Ning Xie
Marcel van Gerven
Derek Doran
+ Explainable Artificial Intelligence: a Systematic Review 2020 Giulia Vilone
Luca Longo
+ Deep Learning Reproducibility and Explainable AI (XAI) 2022 Anastasia Leventi-Peetz
T. Östreich
+ Explainable Deep Learning: A Field Guide for the Uninitiated 2020 Gabriëlle Ras
Ning Xie
Marcel van Gerven
Derek Doran
+ Explaining Explanations: An Approach to Evaluating Interpretability of Machine Learning 2018 Leilani H. Gilpin
David Bau
Ben Z. Yuan
Ayesha Bajwa
Michael A. Specter
Lalana Kagal
+ Going Beyond XAI: A Systematic Survey for Explanation-Guided Learning 2022 Yuyang Gao
Siyi Gu
Junji Jiang
Sung-Soo Hong
Dazhou Yu
Liang Zhao
+ A Practical Tutorial on Explainable AI Techniques 2021 Adrien Bennetot
Ivan Donadello
Ayoub El Qadi
Mauro Dragoni
Thomas Frossard
B.J. Wagner
Anna Saranti
Silvia Tulli
Maria Trocan
Raja Chatila
+ Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond 2022 Anna Karin Hedström
Leander Weber
Dilyara Bareeva
Daniel Krakowczyk
Franz Motzkus
Wojciech Samek
Sebastian Lapuschkin
Marina M.-C. Höhne
+ A Mechanistic Explanatory Strategy for XAI 2024 Marcin Rabiza
+ Explanation Methods in Deep Learning: Users, Values, Concerns and Challenges 2018 Gabriëlle Ras
Marcel van Gerven
Pim Haselager
+ Reviewing the Need for Explainable Artificial Intelligence (xAI) 2020 Julie Gerlings
Arisa Shollo
Ioanna Constantiou

Cited by (60)

Action Title Year Authors
+ Edge Artificial Intelligence for 6G: Vision, Enabling Technologies, and Applications 2021 Khaled B. Letaief
Yuanming Shi
Jianmin Lu
Jianhua Lu
+ Ripple Knowledge Graph Convolutional Networks For Recommendation Systems 2023 Chen Li
Yang Cao
Ye Zhu
Debo Cheng
Chengyuan Li
Yasuhiko Morimoto
+ SpookyNet: Learning force fields with electronic degrees of freedom and nonlocal effects 2021 Oliver T. Unke
Stefan Chmiela
Michael Gastegger
Kristof T. Schütt
Huziel E. Sauceda
Klaus‐Robert Müller
+ HUDD 2022 Hazem Fahmy
Fabrizio Pastore
Lionel Briand
+ CLEVR-XAI: A benchmark dataset for the ground truth evaluation of neural network explanations 2021 Leila Arras
Ahmed Osman
Wojciech Samek
+ Towards Ground Truth Evaluation of Visual Explanations 2020 Ahmed Osman
Leila Arras
Wojciech Samek
+ Challenges for cognitive decoding using deep learning methods 2021 Armin W. Thomas
Christopher RĂ©
Russell A. Poldrack
+ Pull-back Geometry of Persistent Homology Encodings 2023 Shuang Liang
Renata Turkeš
Jiayi Li
Nina Otter
Guido Montúfar
+ On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy 2021 Vignesh Srinivasan
Nils Strodthoff
Jackie Ma
Alexander Binder
Klaus‐Robert Müller
Wojciech Samek
+ How to Explain Neural Networks: an Approximation Perspective 2021 Hangcheng Dong
Bingguo Liu
Fengdong Chen
Ye Dong
Guodong Liu
+ Finding and removing Clever Hans: Using explanation methods to debug and improve deep models 2021 Christopher J. Anders
Leander Weber
David Neumann
Wojciech Samek
Klaus‐Robert Müller
Sebastian Lapuschkin
+ Explain and improve: LRP-inference fine-tuning for image captioning models 2021 Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
+ Pruning by explaining: A novel criterion for deep neural network pruning 2021 Seul-Ki Yeom
Philipp Seegerer
Sebastian Lapuschkin
Alexander Binder
Simon Wiedemann
Klaus‐Robert Müller
Wojciech Samek
+ Evaluating deep transfer learning for whole-brain cognitive decoding 2023 Armin W. Thomas
Ulman Lindenberger
Wojciech Samek
Klaus‐Robert Müller
+ Learning domain invariant representations by joint Wasserstein distance minimization 2023 Léo Andéol
Yusei Kawakami
Yuichiro Wada
Takafumi Kanamori
Klaus‐Robert Müller
Grégoire Montavon
+ Evaluating explainable artificial intelligence methods for multi-label deep learning classification tasks in remote sensing 2021 Ioannis Kakogeorgiou
Κωνσταντίνος Καράντζαλος
+ Deep Learning Methods for Daily Wildfire Danger Forecasting. 2021 Ioannis Prapas
Spyros Kondylatos
Ioannis Papoutsis
Gustau Camps‐Valls
Michele Ronco
Miguel‐Ángel Fernández‐Torres
María Piles
Nuno Carvalhais
+ Delivering Inflated Explanations 2023 Yacine Izza
Alexey Ignatiev
Peter J. Stuckey
João Marques‐Silva
+ Regulative development as a model for origin of life and artificial life studies 2022 Chris Fields
Michael Levin
+ Explanations Based on Item Response Theory (eXirt): A Model-Specific Method to Explain Tree-Ensemble Model in Trust Perspective 2022 José Ribeiro
Lucas Felipe Ferraro Cardoso
Raíssa Silva
Vitor Cirilo
Níkolas Carneiro
Ronnie Alves
+ Inverse Problems for Tumour Growth Models and Neural ODEs 2023 Rym Jaroudi
+ Feature Perturbation Augmentation for Reliable Evaluation of Importance Estimators in Neural Networks 2023 Lennart Brocki
Neo Christopher Chung
+ This looks More Like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation 2022 Srishti Gautam
Marina Höhne
Stine Thestrup Hansen
Robert Jenssen
Michael Kampffmeyer
+ Explaining Deep Learning for ECG Analysis: Building Blocks for Auditing and Knowledge Discovery 2023 Patrick Wagner
Temesgen Mehari
Wilhelm Haverkamp
Nils Strodthoff
+ Explain and Improve: LRP-Inference Fine-Tuning for Image Captioning Models 2020 Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
+ Understanding Image Captioning Models beyond Visualizing Attention 2020 Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
+ Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions 2023 Luca Longo
Mario Brčić
Federico Cabitza
Jaesik Choi
Roberto Confalonieri
Javier Del Ser
Riccardo Guidotti
Yoichi Hayashi
Francisco Herrera
Andreas Holzinger
+ PredDiff: Explanations and interactions from conditional expectations 2022 Stefan BlĂŒcher
Johanna Vielhaben
Nils Strodthoff
+ A systematic review of biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data 2022 Magdalena Wysocka
Oskar Wysocki
Marie Zufferey
Dónal Landers
André Freitas
+ Study of Distractors in Neural Models of Code 2023 Md Rafiqul Islam Rabin
Aftab M. Hussain
Sahil Suneja
Mohammad Amin Alipour
+ Characterization of anomalous diffusion classical statistics powered by deep learning (CONDOR) 2021 Alessia Gentili
Giorgio Volpe
+ Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems 2021 John A. Keith
ValentĂ­n Vassilev-Galindo
Bingqing Cheng
Stefan Chmiela
Michael Gastegger
Klaus‐Robert Müller
Alexandre Tkatchenko
+ Towards robust explanations for deep neural networks 2021 Ann-Kathrin Dombrowski
Christopher J. Anders
Klaus‐Robert Müller
Pan Kessel
+ On Tackling Explanation Redundancy in Decision Trees 2022 Yacine Izza
Alexey Ignatiev
João Marques‐Silva
+ Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling 2023 Yifei Zhang
Neng Gao
Cunqing Ma
+ Learning Domain Invariant Representations by Joint Wasserstein Distance Minimization 2021 Léo Andéol
Yusei Kawakami
Yuichiro Wada
Takafumi Kanamori
Klaus‐Robert Müller
Grégoire Montavon
+ Preemptively Pruning Clever-Hans Strategies in Deep Neural Networks 2023 Lorenz Linhardt
Klaus‐Robert Müller
Grégoire Montavon
+ A Survey on Multi-Objective Based Parameter Optimization for Deep Learning 2023 Mrittika Chakraborty
Wreetbhas Pal
Sanghamitra Bandyopadhyay
Ujjwal Maulik
+ ECQ$^{\text{x}}$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs 2021 Daniel Becking
Maximilian Dreyer
Wojciech Samek
Karsten Müller
Sebastian Lapuschkin
+ NoiseGrad — Enhancing Explanations by Introducing Stochasticity to Model Weights 2022 Kirill Bykov
Anna Karin Hedström
Shinichi Nakajima
Marina Höhne

Citing (136)

Action Title Year Authors
+ The Taylor Decomposition: A Unified Generalization of the Oaxaca Method to Nonlinear Models 2013 Stephen Bazen
Xavier Joutard
+ Very Deep Convolutional Networks for Large-Scale Image Recognition 2014 Karen Simonyan
Andrew Zisserman
+ Deep neural networks are easily fooled: High confidence predictions for unrecognizable images 2015 Anh‐Tu Nguyen
Jason Yosinski
Jeff Clune
+ Explaining and Harnessing Adversarial Examples 2014 Ian Goodfellow
Jonathon Shlens
Christian Szegedy
+ PDF Chat Supervised group Lasso with applications to microarray data analysis 2007 Shuangge Ma
Xiao Song
Jian Huang
+ On the interpretation of weight vectors of linear models in multivariate neuroimaging 2013 Stefan Haufe
Frank C. Meinecke
Kai Görgen
Sven Dähne
John-Dylan Haynes
Benjamin Blankertz
Felix Bießmann
+ The Detection of Errors in Multivariate Data Using Principal Components 1974 Douglas M. Hawkins
+ Deep learning in neural networks: An overview 2014 Jürgen Schmidhuber
+ Batch Effect Confounding Leads to Strong Bias in Performance Estimates Obtained by Cross-Validation 2014 Charlotte Soneson
Sarah Gerster
Mauro Delorenzi
+ ImageNet Large Scale Visual Recognition Challenge 2015 Olga Russakovsky
Jia Deng
Hao Su
Jonathan Krause
Sanjeev Satheesh
Sean Ma
Zhiheng Huang
Andrej Karpathy
Aditya Khosla
Michael S. Bernstein
+ Striving for Simplicity: The All Convolutional Net 2014 Jost Tobias Springenberg
Alexey Dosovitskiy
Thomas Brox
Martin Riedmiller
+ Some methods for classification and analysis of multivariate observations 1967 James B. MacQueen
+ Neural Machine Translation by Jointly Learning to Align and Translate 2014 Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
+ Invited Commentary: Causal Diagrams and Measurement Bias 2009 Miguel A. Hernán
Stephen R. Cole
+ How to Explain Individual Classification Decisions 2009 David Baehrens
Timon Schroeter
Stefan Harmeling
Motoaki Kawanabe
Katja Hansen
Klaus‐Robert Müller
+ On the Number of Linear Regions of Deep Neural Networks 2014 Guido Montúfar
Razvan Pascanu
Kyunghyun Cho
Yoshua Bengio
+ Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
+ Explaining nonlinear classification decisions with deep Taylor decomposition 2016 Grégoire Montavon
Sebastian Lapuschkin
Alexander Binder
Wojciech Samek
Klaus‐Robert Müller
+ Evaluating the Visualization of What a Deep Neural Network Has Learned 2016 Wojciech Samek
Alexander Binder
Grégoire Montavon
Sebastian Lapuschkin
Klaus‐Robert Müller
+ Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks 2016 Anh Mai Nguyen
Jason Yosinski
Jeff Clune
+ Classifying and segmenting microscopy images with deep multiple instance learning 2016 Oren Kraus
Jimmy Ba
Brendan J. Frey
+ SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size 2016 Forrest Iandola
Song Han
Matthew W. Moskewicz
Khalid Ashraf
William J. Dally
Kurt Keutzer
+ Learning Deep Features for Discriminative Localization 2016 Bolei Zhou
Aditya Khosla
Àgata Lapedriza
Aude Oliva
Antonio Torralba
+ Generating Visual Explanations 2016 Lisa Anne Hendricks
Zeynep Akata
Marcus Rohrbach
Jeff Donahue
Bernt Schiele
Trevor Darrell
+ Not Just a Black Box: Learning Important Features Through Propagating Activation Differences 2016 Avanti Shrikumar
Peyton Greenside
A.V. Shcherbina
Anshul Kundaje
+ European Union Regulations on Algorithmic Decision Making and a “Right to Explanation” 2017 Bryce Goodman
Seth Flaxman
+ Quantum-chemical insights from deep tensor neural networks 2017 Kristof T. Schütt
Farhad Arbabzadah
Stefan Chmiela
K. Müller
Alexandre Tkatchenko
+ "What is relevant in a text document?": An interpretable machine learning approach 2017 Leila Arras
Franziska Horn
Grégoire Montavon
Klaus‐Robert Müller
Wojciech Samek
+ Visualizing Deep Neural Network Decisions: Prediction Difference Analysis 2017 Luisa Zintgraf
Taco Cohen
Tameem Adel
Max Welling
+ A Roadmap for a Rigorous Science of Interpretability. 2017 Finale Doshi‐Velez
Been Kim
+ Towards A Rigorous Science of Interpretable Machine Learning 2017 Finale Doshi‐Velez
Been Kim
+ Axiomatic Attribution for Deep Networks 2017 Mukund Sundararajan
Ankur Taly
Qiqi Yan
+ Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks 2017 Guy Katz
Clark Barrett
David L. Dill
Kyle D. Julian
Mykel J. Kochenderfer
+ Neural Collaborative Filtering 2017 Xiangnan He
Lizi Liao
Hanwang Zhang
Liqiang Nie
Xia Hu
Tat‐Seng Chua
+ Learning Important Features Through Propagating Activation Differences 2017 Avanti Shrikumar
Peyton Greenside
Anshul Kundaje
+ SmoothGrad: removing noise by adding noise 2017 Daniel Smilkov
Nikhil Thorat
Been Kim
Fernanda Viégas
Martin Wattenberg
+ Methods for interpreting and understanding deep neural networks 2017 Grégoire Montavon
Wojciech Samek
Klaus‐Robert Müller
+ Learning how to explain neural networks: PatternNet and PatternAttribution 2017 Pieter Jan Kindermans
Kristof T. Schütt
Maximilian Alber
K. Müller
Dumitru Erhan
Been Kim
Sven Dähne
+ PDF Chat Top-Down Neural Attention by Excitation Backprop 2017 Jianming Zhang
Sarah Adel Bargal
Zhe Lin
Jonathan Brandt
Xiaohui Shen
Stan Sclaroff
+ How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation. 2018 Menaka Narayanan
Emily Chen
Jeffrey He
Been Kim
Sam Gershman
Finale Doshi‐Velez