Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications

Type: Review

Publication Date: 2021-03-01

Citations: 148

DOI: https://doi.org/10.1109/jproc.2021.3060483


Abstract

With the broad and highly successful use of machine learning in industry and the sciences, there has been a growing demand for explainable AI. Interpretability and explanation methods for gaining a better understanding of the problem-solving abilities and strategies of nonlinear machine learning, in particular deep neural networks, are therefore receiving increased attention. In this work we aim to (1) provide a timely overview of this active emerging field, with a focus on 'post-hoc' explanations, and explain its theoretical foundations; (2) put interpretability algorithms to a test, both from a theoretical and a comparative evaluation perspective, using extensive simulations; (3) outline best-practice aspects, i.e., how best to include interpretation methods in the standard usage of machine learning; and (4) demonstrate successful usage of explainable AI in a representative selection of application scenarios. Finally, we discuss challenges and possible future directions of this exciting foundational field of machine learning.
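Among the 'post-hoc' explanation methods this review surveys are gradient-based attributions. As a minimal illustrative sketch (not code from the paper, and with hypothetical weights), the common "gradient × input" heuristic can be computed by hand for a tiny two-layer ReLU network:

```python
import numpy as np

# Minimal sketch of post-hoc "gradient x input" attribution for a tiny
# two-layer ReLU network. All weights here are random/hypothetical.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # hidden x input weight matrix
w2 = rng.normal(size=4)        # output weights (scalar output)

def forward(x):
    z = W1 @ x                 # pre-activations (zero biases)
    h = np.maximum(z, 0.0)     # ReLU
    return w2 @ h, z

def gradient_x_input(x):
    """Relevance of each input feature i: (df/dx_i) * x_i."""
    _, z = forward(x)
    mask = (z > 0).astype(float)   # ReLU derivative
    grad = W1.T @ (w2 * mask)      # chain rule through the network
    return grad * x

x = np.array([1.0, -0.5, 2.0])
relevance = gradient_x_input(x)
f, _ = forward(x)
# With zero biases the network is positively homogeneous, so the
# attributions sum exactly to the output (a "completeness" property).
print(np.isclose(relevance.sum(), f))  # True
```

The completeness check at the end illustrates one of the sanity criteria (conservation of the output score) that several of the surveyed evaluation approaches use to compare attribution methods.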

Locations

  • arXiv (Cornell University) - View - PDF
  • DataCite API - View
  • Proceedings of the IEEE - View - PDF

Similar Works

Action Title Year Authors
+ Toward Interpretable Machine Learning: Transparent Deep Neural Networks and Beyond 2020 Wojciech Samek
Grégoire Montavon
Sebastian Lapuschkin
Christopher J. Anders
Klaus‐Robert Müller
+ Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey 2020 Arun Das
Paul Rad
+ Hide-and-Seek: A Template for Explainable AI 2020 Thanos Tagaris
Andreas Stafylopatis
+ Explaining Deep Neural Networks by Leveraging Intrinsic Methods 2024 Biagio La Rosa
+ Explaining Explanations: An Overview of Interpretability of Machine Learning 2018 Leilani H. Gilpin
David Bau
Ben Z. Yuan
Ayesha Bajwa
Michael A. Specter
Lalana Kagal
+ Explainable Deep Learning: A Field Guide for the Uninitiated 2022 Gabriëlle Ras
Ning Xie
Marcel van Gerven
Derek Doran
+ Explainable Artificial Intelligence: a Systematic Review 2020 Giulia Vilone
Luca Longo
+ Deep Learning Reproducibility and Explainable AI (XAI) 2022 Anastasia Leventi-Peetz
T. Östreich
+ Explainable Deep Learning: A Field Guide for the Uninitiated 2020 Gabriëlle Ras
Ning Xie
Marcel van Gerven
Derek Doran
+ Explaining Explanations: An Approach to Evaluating Interpretability of Machine Learning 2018 Leilani H. Gilpin
David Bau
Ben Z. Yuan
Ayesha Bajwa
Michael A. Specter
Lalana Kagal
+ Going Beyond XAI: A Systematic Survey for Explanation-Guided Learning 2022 Yuyang Gao
Siyi Gu
Junji Jiang
Sung-Soo Hong
Dazhou Yu
Liang Zhao
+ A Practical Tutorial on Explainable AI Techniques 2021 Adrien Bennetot
Ivan Donadello
Ayoub El Qadi
Mauro Dragoni
Thomas Frossard
B.J. Wagner
Anna Saranti
Silvia Tulli
Maria Trocan
Raja Chatila
+ Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations and Beyond 2022 Anna Karin Hedström
Leander Weber
Dilyara Bareeva
Daniel Krakowczyk
Franz Motzkus
Wojciech Samek
Sebastian Lapuschkin
Marina M.-C. Höhne
+ A Mechanistic Explanatory Strategy for XAI 2024 Marcin Rabiza
+ Explanation Methods in Deep Learning: Users, Values, Concerns and Challenges 2018 Gabriëlle Ras
Marcel van Gerven
Pim Haselager
+ Reviewing the Need for Explainable Artificial Intelligence (xAI) 2020 Julie Gerlings
Arisa Shollo
Ioanna Constantiou

Cited by (60)

Action Title Year Authors
+ Edge Artificial Intelligence for 6G: Vision, Enabling Technologies, and Applications 2021 Khaled B. Letaief
Yuanming Shi
Jianmin Lu
Jianhua Lu
+ Ripple Knowledge Graph Convolutional Networks For Recommendation Systems 2023 Chen Li
Yang Cao
Ye Zhu
Debo Cheng
Chengyuan Li
Yasuhiko Morimoto
+ SpookyNet: Learning force fields with electronic degrees of freedom and nonlocal effects 2021 Oliver T. Unke
Stefan Chmiela
Michael Gastegger
Kristof T. Schütt
Huziel E. Sauceda
Klaus‐Robert Müller
+ HUDD 2022 Hazem Fahmy
Fabrizio Pastore
Lionel Briand
+ CLEVR-XAI: A benchmark dataset for the ground truth evaluation of neural network explanations 2021 Leila Arras
Ahmed Osman
Wojciech Samek
+ Towards Ground Truth Evaluation of Visual Explanations 2020 Ahmed Osman
Leila Arras
Wojciech Samek
+ Challenges for cognitive decoding using deep learning methods 2021 Armin W. Thomas
Christopher RĂ©
Russell A. Poldrack
+ Pull-back Geometry of Persistent Homology Encodings 2023 Shuang Liang
Renata Turkeš
Jiayi Li
Nina Otter
Guido Montúfar
+ On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy 2021 Vignesh Srinivasan
Nils Strodthoff
Jackie Ma
Alexander Binder
Klaus‐Robert Müller
Wojciech Samek
+ How to Explain Neural Networks: an Approximation Perspective 2021 Hangcheng Dong
Bingguo Liu
Fengdong Chen
Ye Dong
Guodong Liu
+ Finding and removing Clever Hans: Using explanation methods to debug and improve deep models 2021 Christopher J. Anders
Leander Weber
David Neumann
Wojciech Samek
Klaus‐Robert Müller
Sebastian Lapuschkin
+ Explain and improve: LRP-inference fine-tuning for image captioning models 2021 Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
+ Pruning by explaining: A novel criterion for deep neural network pruning 2021 Seul-Ki Yeom
Philipp Seegerer
Sebastian Lapuschkin
Alexander Binder
Simon Wiedemann
Klaus‐Robert Müller
Wojciech Samek
+ Evaluating deep transfer learning for whole-brain cognitive decoding 2023 Armin W. Thomas
Ulman Lindenberger
Wojciech Samek
Klaus‐Robert Müller
+ Learning domain invariant representations by joint Wasserstein distance minimization 2023 Léo Andéol
Yusei Kawakami
Yuichiro Wada
Takafumi Kanamori
Klaus‐Robert Müller
Grégoire Montavon
+ Evaluating explainable artificial intelligence methods for multi-label deep learning classification tasks in remote sensing 2021 Ioannis Kakogeorgiou
Κωνσταντίνος Καράντζαλος
+ Deep Learning Methods for Daily Wildfire Danger Forecasting. 2021 Ioannis Prapas
Spyros Kondylatos
Ioannis Papoutsis
Gustau Camps‐Valls
Michele Ronco
Miguel‐Ángel Fernández‐Torres
María Piles
Nuno Carvalhais
+ Delivering Inflated Explanations 2023 Yacine Izza
Alexey Ignatiev
Peter J. Stuckey
João Marques‐Silva
+ Regulative development as a model for origin of life and artificial life studies 2022 Chris Fields
Michael Levin
+ Explanations Based on Item Response Theory (eXirt): A Model-Specific Method to Explain Tree-Ensemble Model in Trust Perspective 2022 José Ribeiro
Lucas Felipe Ferraro Cardoso
Raíssa Silva
Vitor Cirilo
Níkolas Carneiro
Ronnie Alves
+ Inverse Problems for Tumour Growth Models and Neural ODEs 2023 Rym Jaroudi
+ Feature Perturbation Augmentation for Reliable Evaluation of Importance Estimators in Neural Networks 2023 Lennart Brocki
Neo Christopher Chung
+ This looks More Like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation 2022 Srishti Gautam
Marina Höhne
Stine Thestrup Hansen
Robert Jenssen
Michael Kampffmeyer
+ Explaining Deep Learning for ECG Analysis: Building Blocks for Auditing and Knowledge Discovery 2023 Patrick Wagner
Temesgen Mehari
Wilhelm Haverkamp
Nils Strodthoff
+ Explain and Improve: LRP-Inference Fine-Tuning for Image Captioning Models 2020 Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
+ Understanding Image Captioning Models beyond Visualizing Attention 2020 Jiamei Sun
Sebastian Lapuschkin
Wojciech Samek
Alexander Binder
+ Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions 2023 Luca Longo
Mario Brčić
Federico Cabitza
Jaesik Choi
Roberto Confalonieri
Javier Del Ser
Riccardo Guidotti
Yoichi Hayashi
Francisco Herrera
Andreas Holzinger
+ PredDiff: Explanations and interactions from conditional expectations 2022 Stefan BlĂŒcher
Johanna Vielhaben
Nils Strodthoff
+ A systematic review of biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data 2022 Magdalena Wysocka
Oskar Wysocki
Marie Zufferey
Dónal Landers
André Freitas
+ Study of Distractors in Neural Models of Code 2023 Md Rafiqul Islam Rabin
Aftab M. Hussain
Sahil Suneja
Mohammad Amin Alipour
+ Characterization of anomalous diffusion classical statistics powered by deep learning (CONDOR) 2021 Alessia Gentili
Giorgio Volpe
+ Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems 2021 John A. Keith
ValentĂ­n Vassilev-Galindo
Bingqing Cheng
Stefan Chmiela
Michael Gastegger
Klaus‐Robert Müller
Alexandre Tkatchenko
+ Towards robust explanations for deep neural networks 2021 Ann-Kathrin Dombrowski
Christopher J. Anders
Klaus‐Robert Müller
Pan Kessel
+ On Tackling Explanation Redundancy in Decision Trees 2022 Yacine Izza
Alexey Ignatiev
João Marques‐Silva
+ Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling 2023 Yifei Zhang
Neng Gao
Cunqing Ma
+ Learning Domain Invariant Representations by Joint Wasserstein Distance Minimization 2021 Léo Andéol
Yusei Kawakami
Yuichiro Wada
Takafumi Kanamori
Klaus‐Robert Müller
Grégoire Montavon
+ Preemptively Pruning Clever-Hans Strategies in Deep Neural Networks 2023 Lorenz Linhardt
Klaus‐Robert Müller
Grégoire Montavon
+ A Survey on Multi-Objective Based Parameter Optimization for Deep Learning 2023 Mrittika Chakraborty
Wreetbhas Pal
Sanghamitra Bandyopadhyay
Ujjwal Maulik
+ ECQ$^{\text{x}}$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs 2021 Daniel Becking
Maximilian Dreyer
Wojciech Samek
Karsten Müller
Sebastian Lapuschkin
+ NoiseGrad — Enhancing Explanations by Introducing Stochasticity to Model Weights 2022 Kirill Bykov
Anna Karin Hedström
Shinichi Nakajima
Marina Höhne

Citing (136)

Action Title Year Authors
+ The Taylor Decomposition: A Unified Generalization of the Oaxaca Method to Nonlinear Models 2013 Stephen Bazen
Xavier Joutard
+ Very Deep Convolutional Networks for Large-Scale Image Recognition 2014 Karen Simonyan
Andrew Zisserman
+ Deep neural networks are easily fooled: High confidence predictions for unrecognizable images 2015 Anh‐Tu Nguyen
Jason Yosinski
Jeff Clune
+ Explaining and Harnessing Adversarial Examples 2014 Ian Goodfellow
Jonathon Shlens
Christian Szegedy
+ PDF Chat Supervised group Lasso with applications to microarray data analysis 2007 Shuangge Ma
Xiao Song
Jian Huang
+ On the interpretation of weight vectors of linear models in multivariate neuroimaging 2013 Stefan Haufe
Frank C. Meinecke
Kai Görgen
Sven Dähne
John-Dylan Haynes
Benjamin Blankertz
Felix Bießmann
+ The Detection of Errors in Multivariate Data Using Principal Components 1974 Douglas M. Hawkins
+ Deep learning in neural networks: An overview 2014 Jürgen Schmidhuber
+ Batch Effect Confounding Leads to Strong Bias in Performance Estimates Obtained by Cross-Validation 2014 Charlotte Soneson
Sarah Gerster
Mauro Delorenzi
+ ImageNet Large Scale Visual Recognition Challenge 2015 Olga Russakovsky
Jia Deng
Hao Su
Jonathan Krause
Sanjeev Satheesh
Sean Ma
Zhiheng Huang
Andrej Karpathy
Aditya Khosla
Michael S. Bernstein
+ Striving for Simplicity: The All Convolutional Net 2014 Jost Tobias Springenberg
Alexey Dosovitskiy
Thomas Brox
Martin Riedmiller
+ Some methods for classification and analysis of multivariate observations 1967 James B. MacQueen
+ Neural Machine Translation by Jointly Learning to Align and Translate 2014 Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
+ Invited Commentary: Causal Diagrams and Measurement Bias 2009 Miguel A. Hernán
Stephen R. Cole
+ How to Explain Individual Classification Decisions 2009 David Baehrens
Timon Schroeter
Stefan Harmeling
Motoaki Kawanabe
Katja Hansen
Klaus‐Robert Müller
+ On the Number of Linear Regions of Deep Neural Networks 2014 Guido Montúfar
Razvan Pascanu
Kyunghyun Cho
Yoshua Bengio
+ Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
+ Explaining nonlinear classification decisions with deep Taylor decomposition 2016 Grégoire Montavon
Sebastian Lapuschkin
Alexander Binder
Wojciech Samek
Klaus‐Robert Müller
+ Evaluating the Visualization of What a Deep Neural Network Has Learned 2016 Wojciech Samek
Alexander Binder
Grégoire Montavon
Sebastian Lapuschkin
Klaus‐Robert Müller
+ Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks 2016 Anh Mai Nguyen
Jason Yosinski
Jeff Clune
+ Classifying and segmenting microscopy images with deep multiple instance learning 2016 Oren Kraus
Jimmy Ba
Brendan J. Frey
+ SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size 2016 Forrest Iandola
Song Han
Matthew W. Moskewicz
Khalid Ashraf
William J. Dally
Kurt Keutzer
+ Learning Deep Features for Discriminative Localization 2016 Bolei Zhou
Aditya Khosla
Àgata Lapedriza
Aude Oliva
Antonio Torralba
+ Generating Visual Explanations 2016 Lisa Anne Hendricks
Zeynep Akata
Marcus Rohrbach
Jeff Donahue
Bernt Schiele
Trevor Darrell
+ Not Just a Black Box: Learning Important Features Through Propagating Activation Differences 2016 Avanti Shrikumar
Peyton Greenside
A.V. Shcherbina
Anshul Kundaje
+ European Union Regulations on Algorithmic Decision Making and a “Right to Explanation” 2017 Bryce Goodman
Seth Flaxman
+ Quantum-chemical insights from deep tensor neural networks 2017 Kristof T. Schütt
Farhad Arbabzadah
Stefan Chmiela
K. Müller
Alexandre Tkatchenko
+ "What is relevant in a text document?": An interpretable machine learning approach 2017 Leila Arras
Franziska Horn
Grégoire Montavon
Klaus‐Robert Müller
Wojciech Samek
+ Visualizing Deep Neural Network Decisions: Prediction Difference Analysis 2017 Luisa Zintgraf
Taco Cohen
Tameem Adel
Max Welling
+ A Roadmap for a Rigorous Science of Interpretability. 2017 Finale Doshi‐Velez
Been Kim
+ Towards A Rigorous Science of Interpretable Machine Learning 2017 Finale Doshi‐Velez
Been Kim
+ Axiomatic Attribution for Deep Networks 2017 Mukund Sundararajan
Ankur Taly
Qiqi Yan
+ Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks 2017 Guy Katz
Clark Barrett
David L. Dill
Kyle D. Julian
Mykel J. Kochenderfer
+ Neural Collaborative Filtering 2017 Xiangnan He
Lizi Liao
Hanwang Zhang
Liqiang Nie
Xia Hu
Tat‐Seng Chua
+ Learning Important Features Through Propagating Activation Differences 2017 Avanti Shrikumar
Peyton Greenside
Anshul Kundaje
+ SmoothGrad: removing noise by adding noise 2017 Daniel Smilkov
Nikhil Thorat
Been Kim
Fernanda Viégas
Martin Wattenberg
+ Methods for interpreting and understanding deep neural networks 2017 Grégoire Montavon
Wojciech Samek
Klaus‐Robert Müller
+ Learning how to explain neural networks: PatternNet and PatternAttribution 2017 Pieter Jan Kindermans
Kristof T. Schütt
Maximilian Alber
K. Müller
Dumitru Erhan
Been Kim
Sven Dähne
+ PDF Chat Top-Down Neural Attention by Excitation Backprop 2017 Jianming Zhang
Sarah Adel Bargal
Zhe Lin
Jonathan Brandt
Xiaohui Shen
Stan Sclaroff
+ How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation. 2018 Menaka Narayanan
Emily Chen
Jeffrey He
Been Kim
Sam Gershman
Finale Doshi‐Velez