Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation

Visual Question Answering on 3D Point Clouds (VQA-3D) is an emerging yet challenging field that aims to answer various types of textual questions about an entire point cloud scene. To tackle this problem, we propose CLEVR3D, a large-scale VQA-3D dataset consisting of 171K questions from 8,771 3D scenes. Specifically, …