Instance-free Text to Point Cloud Localization with Relative Position
Awareness
Text-to-point-cloud cross-modal localization is an emerging vision-language task critical for future robot-human collaboration. It seeks to localize a position from a city-scale point cloud scene based on a few natural language instructions. In this paper, we address two key limitations of existing approaches: 1) their reliance on ground-truth instances as …