RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning
and Manipulation
A fundamental objective in robot manipulation is to enable models to comprehend visual scenes and execute actions. Although existing robot Multimodal Large Language Models (MLLMs) can handle a range of basic tasks, they still face challenges in two areas: 1) inadequate reasoning ability to tackle complex tasks, and 2) high …