An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA

Knowledge-based visual question answering (VQA) involves answering questions that require external knowledge not present in the image. Existing methods first retrieve knowledge from external resources, then reason over the selected knowledge, the input image, and the question to predict an answer. However, this two-step approach could lead to mismatches that potentially limit …