Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information
To effectively exploit the potential of large-scale models, various pre-training strategies supported by massive data from different sources have been proposed, including supervised pre-training, weakly-supervised pre-training, and self-supervised pre-training. It has been shown that combining multiple pre-training strategies and data from various modalities/sources can greatly boost the training of large-scale models. …