Self-Supervised Pre-training with Symmetric Superimposition Modeling for
Scene Text Recognition
In text recognition, self-supervised pre-training has emerged as a good solution for reducing dependence on expansive annotated real data. Previous studies primarily focus on local visual representations by leveraging masked image modeling or sequence contrastive learning. However, they omit modeling the linguistic information in text images, which is crucial for recognizing …