Vision Language Models for Spreadsheet Understanding: Challenges and Opportunities

This paper explores the capabilities of Vision Language Models (VLMs) on spreadsheet comprehension. We propose three self-supervised challenges with corresponding evaluation metrics to comprehensively evaluate VLMs on Optical Character Recognition (OCR), spatial perception, and visual format recognition. Additionally, we utilize the spreadsheet table detection task to assess the overall performance of VLMs …