Towards Multimodal Multitask Scene Understanding Models for Indoor Mobile Agents
The perception system in personalized mobile agents requires indoor scene understanding models that can understand 3D geometry, capture objectness, analyze human behavior, etc. Nonetheless, this direction has not been well explored in comparison with models for outdoor environments (e.g., the autonomous driving system that includes pedestrian prediction, car detection, traffic …