Foundation Models for Video Understanding: A Survey
Video Foundation Models (ViFMs) aim to learn a general-purpose representation for various video understanding tasks. Leveraging large-scale datasets and powerful models, ViFMs achieve this by capturing robust and generic features from video data. This survey analyzes over 200 video foundational models, offering a comprehensive overview of benchmarks and evaluation metrics across 14 …