Ask a Question

Prefer a chat interface with context about you and your work?

Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency

Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency

Natural videos provide rich visual contents for selfsupervised learning. Yet most existing approaches for learning spatio-temporal representations rely on manually trimmed videos, leading to limited diversity in visual patterns and limited performance gain. In this work, we aim to learn representations by leveraging more abundant information in untrimmed videos. To …