Ask a Question

Prefer a chat interface with context about you and your work?

A Survey on Vision-Language-Action Models for Embodied AI

A Survey on Vision-Language-Action Models for Embodied AI

Deep learning has demonstrated remarkable success across many domains, including computer vision, natural language processing, and reinforcement learning. Representative artificial neural networks in these fields span convolutional neural networks, Transformers, and deep Q-networks. Built upon unimodal neural networks, numerous multi-modal models have been introduced to address a range of tasks …