Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions

Type: Article

Publication Date: 2018-01-01

Citations: 54

DOI: https://doi.org/10.18653/v1/d18-1164

Abstract

In Visual Question Answering, most existing approaches adopt the pipeline of representing an image via pre-trained CNNs, and then using the uninterpretable CNN features in conjunction with the question to predict the answer. Although such end-to-end models might report promising performance, they rarely provide any insight, apart from the answer, into the VQA process. In this work, we propose to break up the end-to-end VQA into two steps: explaining and reasoning, in an attempt towards a more explainable VQA by shedding light on the intermediate results between these two steps. To that end, we first extract attributes and generate descriptions as explanations for an image. Next, a reasoning module utilizes these explanations in place of the image to infer an answer. The advantages of such a breakdown include: (1) the attributes and captions can reflect what the system extracts from the image, thus can provide some insights for the predicted answer; (2) these intermediate results can help identify the inabilities of the image understanding or the answer inference part when the predicted answer is wrong. We conduct extensive experiments on a popular VQA dataset and our system achieves comparable performance with the baselines, yet with added benefits of explainability and the inherent ability to further improve with higher quality explanations.
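
The two-step breakdown described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Explanation`, `answer_question`, `toy_explain`, and `toy_reason` are hypothetical, and the toy functions are hard-coded stand-ins for trained attribute, captioning, and reasoning models. The sketch only shows the data flow, in which the reasoning step receives the explanation in place of the image.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Explanation:
    """Human-readable intermediate result produced from the image."""
    attributes: List[str]  # words describing what was detected in the image
    caption: str           # generated sentence describing the image


def answer_question(
    image: object,
    question: str,
    explain: Callable[[object], Explanation],
    reason: Callable[[str, Explanation], str],
) -> Tuple[str, Explanation]:
    """Two-step VQA: the reasoner sees only the explanation, never the image."""
    explanation = explain(image)            # step 1: explaining
    answer = reason(question, explanation)  # step 2: reasoning
    return answer, explanation


# Hypothetical toy stand-ins; a real system would use trained models here.
def toy_explain(image: object) -> Explanation:
    return Explanation(
        attributes=["dog", "frisbee", "grass"],
        caption="a dog catching a frisbee on the grass",
    )


def toy_reason(question: str, explanation: Explanation) -> str:
    # Answer "yes" if any extracted attribute appears as a word in the question.
    words = question.lower().rstrip("?").split()
    return "yes" if any(a in words for a in explanation.attributes) else "no"


answer, explanation = answer_question(
    "image.jpg", "Is there a dog?", toy_explain, toy_reason
)
print(answer, explanation.attributes)
```

Returning the explanation together with the answer mirrors advantage (2) in the abstract: when an answer is wrong, the attributes and caption can be inspected to see whether the fault lies in the explaining step or in the reasoning step.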

Locations

  • arXiv (Cornell University)
  • Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Similar Works

Action Title Year Authors
+ VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions 2018 Qing Li
Qingyi Tao
Shafiq Joty
Jianfei Cai
Jiebo Luo
+ Coarse-to-Fine Reasoning for Visual Question Answering 2022 Binh X. Nguyen
Tuong Do
Huy Dat Tran
Erman Tjiputra
Quang D. Tran
Anh Nguyen
+ Coarse-to-Fine Reasoning for Visual Question Answering 2021 Binh X. Nguyen
Tuong Do
Huy Tran
Erman Tjiputra
Quang D. Tran
Anh Nguyen
+ Answer Them All! Toward Universal Visual Question Answering Models 2019 Robik Shrestha
Kushal Kafle
Christopher Kanan
+ Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering 2017 Vahid Kazemi
Ali Elqursh
+ Convincing Rationales for Visual Question Answering Reasoning 2024 Kun Li
George Vosselman
Michael Ying Yang
+ Image Captioning and Visual Question Answering Based on Attributes and Their Related External Knowledge. 2016 Qi Wu
Chunhua Shen
Anton van den Hengel
Peng Wang
Anthony Dick
+ Faithful Multimodal Explanation for Visual Question Answering 2019 Jialin Wu
Raymond J. Mooney
+ Visual Question Answering: A Survey of Methods and Datasets 2016 Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
Anthony Dick
Anton van den Hengel
+ Beyond VQA: Generating Multi-word Answer and Rationale to Visual Questions 2020 Radhika Dua
Sai Srinivas Kancheti
Vineeth N Balasubramanian
+ Image Captioning and Visual Question Answering Based on Attributes and External Knowledge 2016 Qi Wu
Chunhua Shen
Anton van den Hengel
Peng Wang
Anthony Dick
+ Faithful Multimodal Explanation for Visual Question Answering 2018 Jialin Wu
Raymond J. Mooney
+ Image Captioning and Visual Question Answering Based on Attributes and External Knowledge 2017 Qi Wu
Chunhua Shen
Peng Wang
Anthony Dick
Anton van den Hengel
+ Interpretable Visual Question Answering by Visual Grounding From Attention Supervision Mining 2019 Yundong Zhang
Juan Carlos Niebles
Álvaro Soto
+ Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining 2018 Yundong Zhang
Juan Carlos Niebles
Álvaro Soto
+ VQA: Visual Question Answering 2015 Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. Lawrence Zitnick
Dhruv Batra
Devi Parikh
+ Self-Critical Reasoning for Robust Visual Question Answering 2019 Jialin Wu
Raymond J. Mooney
+ LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering 2020 Weixin Liang
Feiyang Niu
Aishwarya Reganti
Govind Thattai
Gökhan Tür

Works That Cite This (27)

Action Title Year Authors
+ Visual question answering based on local-scene-aware referring expression generation 2021 Jung-Jun Kim
Dong-Gyu Lee
Jialin Wu
Hong-Gyu Jung
Seong‐Whan Lee
+ A-OKVQA: A Benchmark for Visual Question Answering Using World Knowledge 2022 Dustin Schwenk
Apoorv Khandelwal
Christopher Clark
Kenneth Marino
Roozbeh Mottaghi
+ Visual Question Answering based on Local-Scene-Aware Referring Expression Generation 2021 Jungjun Kim
Dong-Gyu Lee
Jialin Wu
Hong-Gyu Jung
Seong‐Whan Lee
+ LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering 2020 Weixin Liang
Feiyang Niu
Aishwarya Reganti
Govind Thattai
Gökhan Tür
+ QA2Explanation: Generating and Evaluating Explanations for Question Answering Systems over Knowledge Graph 2020 Saeedeh Shekarpour
Abhishek Nadgeri
Kuldeep Singh
+ SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions 2020 Ramprasaath R. Selvaraju
Purva Tendulkar
Devi Parikh
Eric Horvitz
Marco Túlio Ribeiro
Besmira Nushi
Ece Kamar
+ Relation-Aware Graph Attention Network for Visual Question Answering 2019 Linjie Li
Zhe Gan
Yu Cheng
Jingjing Liu
+ SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions 2020 Ramprasaath R. Selvaraju
Purva Tendulkar
Devi Parikh
Eric Horvitz
Marco Ribeiro
Besmira Nushi
Ece Kamar
+ Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning 2021 Zhicheng Huang
Zhaoyang Zeng
Yupan Huang
Bei Liu
Dongmei Fu
Jianlong Fu