Ask a Question

Prefer a chat interface with context about you and your work?

Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition

Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition

Existing Scene Text Recognition (STR) methods typically use a language model to optimize the joint probability of the 1D character sequence predicted by a visual recognition (VR) model, which ignore the 2D spatial context of visual semantics within and between character instances, making them not generalize well to arbitrary shape …