Self-supervised Pretraining and Finetuning for Monocular Depth and Visual Odometry

For the task of simultaneous monocular depth and visual odometry estimation, we propose learning self-supervised transformer-based models in two steps. The first step is generic pretraining to learn 3D geometry using the cross-view completion objective (CroCo); the second is self-supervised finetuning on non-annotated videos. We show that our self-supervised models …