Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions

In Visual Question Answering, most existing approaches follow a pipeline that represents an image with a pre-trained CNN and then uses the uninterpretable CNN features, together with the question, to predict the answer. Although such end-to-end models might report promising performance, they rarely provide any insight, apart from the answer, …