Tell Me What's Next: Textual Foresight for Generic UI Representations

Mobile app user interfaces (UIs) are rich with action, text, structure, and image content that can be used to learn generic UI representations for tasks such as automating user commands, summarizing content, and evaluating the accessibility of user interfaces. Prior work has learned strong visual representations with local or global captioning …