Ask a Question

Prefer a chat interface with context about you and your work?

Spatial Knowledge Distillation to Aid Visual Reasoning

Spatial Knowledge Distillation to Aid Visual Reasoning

For tasks involving language and vision, the current state-of-the-art methods tend not to leverage any additional information that might be present to gather relevant (commonsense) knowledge. A representative task is Visual Question Answering where large diagnostic datasets have been proposed to test a system's capability of answering questions about images. …