GeoX: Geometric Problem Solving Through Unified Formalized
Vision-Language Pre-training
Despite their proficiency in general tasks, Multi-modal Large Language Models (MLLMs) struggle with automatic Geometry Problem Solving (GPS), which demands understanding diagrams, interpreting symbols, and performing complex reasoning. This limitation arises from their pre-training on natural images and texts, along with the lack of automated verification in the problem-solving process. …