Spatial-then-Temporal Self-Supervised Learning for Video Correspondence

In low-level video analysis, effective representations are important for deriving correspondences between video frames. In some recent studies, these representations have been learned in a self-supervised fashion from unlabeled images or videos, using carefully designed pretext tasks. However, previous work concentrates on either spatial-discriminative features or temporal-repetitive features, …