PreViTS: Contrastive Pretraining with Video Tracking Supervision
Videos are a rich source for self-supervised learning (SSL) of visual representations because they contain natural temporal transformations of objects. However, current methods typically sample video clips at random for learning, which results in an imperfect supervisory signal. In this work, we propose PreViTS, an SSL framework that utilizes …