FlowVQA: Mapping Multimodal Logic in Visual Question Answering with
Flowcharts
Existing benchmarks for visual question answering lack visual grounding and complexity, particularly for evaluating spatial reasoning skills. We introduce FlowVQA, a novel benchmark for assessing how well visual question-answering multimodal language models reason with flowcharts as visual contexts. FlowVQA comprises 2,272 carefully generated and human-verified flowchart …