Vision Language Models for Spreadsheet Understanding: Challenges and
Opportunities
This paper explores the capabilities of Vision Language Models (VLMs) in spreadsheet comprehension. We propose three self-supervised challenges, each with corresponding evaluation metrics, to comprehensively evaluate VLMs on Optical Character Recognition (OCR), spatial perception, and visual format recognition. Additionally, we use the spreadsheet table detection task to assess the overall performance of VLMs …