TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for
Zero-shot Object Navigation
TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for
Zero-shot Object Navigation
The Zero-Shot Object Navigation (ZSON) task requires embodied agents to find a previously unseen object by navigating in unfamiliar environments. Such a goal-oriented exploration heavily relies on the ability to perceive, understand, and reason based on the spatial information of the environment. However, current LLM-based approaches convert visual observations to …