DOrA: 3D Visual Grounding with Order-Aware Referring

3D visual grounding aims to identify the target object in a 3D point cloud scene that a natural language description refers to. While previous works attempt to exploit the verbo-visual relation with cross-modal transformers, unstructured natural utterances and scattered objects can degrade performance. In this paper, we …