PreViTS: Contrastive Pretraining with Video Tracking Supervision

Videos are a rich source for self-supervised learning (SSL) of visual representations because the objects in them undergo natural temporal transformations. However, current methods typically sample video clips at random for learning, which results in an imperfect supervisory signal. In this work, we propose PreViTS, an SSL framework that utilizes …