Understanding mobile GUI: From pixel-words to screen-sentences

Type: Article

Publication Date: 2024-07-17

Citations: 0

DOI: https://doi.org/10.1016/j.neucom.2024.128200

Locations

  • arXiv (Cornell University)
  • Neurocomputing

Similar Works

  • Understanding Mobile GUI: from Pixel-Words to Screen-Sentences (2021). Jingwen Fu, Xiaoyi Zhang, Yuwang Wang, Wenjun Zeng, Sam Yang, Grayson Hilliard
  • HelpViz: Automatic Generation of Contextual Visual Mobile Tutorials from Text-Based Instructions (2021). Mingyuan Zhong, Gang Li, Peggy Chi, Yang Li
  • Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding (2024). Yue Fan, Lei Ding, Ching-Chen Kuo, Shan Jiang, Yang Zhao, Xinze Guan, Jie Yang, Yi Zhang, Xin Wang
  • Aurora: Navigating UI Tarpits via Automated Neural Screen Understanding (2024). Safwat Ali Khan, Wenyu Wang, Yiran Ren, Bin Zhu, Jiangfan Shi, Alyssa McGowan, Wing Lam, Kevin Moran
  • ScreenAI: A Vision-Language Model for UI and Infographics Understanding (2024). Gilles Baechler, Srinivas Sunkara, Maria Wang, Fedir Zubach, Hassan Mansoor, Vincent Etter, Victor Cărbune, Jason Lin, Jindong Chen, Abhanshu Sharma
  • Referring to Screen Texts with Voice Assistants (2023). Shruti Bhargava Choubey, Anand Dhoot, Ing-Marie Jonsson, Long Hoang Nguyen, Alkesh Patel, Hong Yu, Vincent Renkens
  • Vision-driven Automated Mobile GUI Testing via Multimodal Large Language Model (2024). Zhe Liu, Cheng Li, Chunyang Chen, Junjie Wang, Boyu Wu, Yawen Wang, Jun Hu, Qing Wang
  • SmartPhone: Exploring Keyword Mnemonic with Auto-generated Verbal and Visual Cues (2023). Jaewook Lee, Andrew Lan
  • Towards Better Semantic Understanding of Mobile Interfaces (2022). Srinivas Sunkara, Maria Wang, Lijuan Liu, Gilles Baechler, Yu-Chung Hsiao, Jindong Chen, Abhanshu Sharma, James W. Stout
  • Inferring Alt-text For UI Icons With Large Language Models During App Development (2024). Sabrina Haque, Christoph Csallner
  • Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus (2022). Gang Li, Yang Li
  • Screen Parsing: Towards Reverse Engineering of UI Models from Screenshots (2021). Jason Wu, Xiaoyi Zhang, Jeffrey A. Nichols, Jeffrey P. Bigham
  • Training a Vision Language Model as Smartphone Assistant (2024). Nicolai Dorka, Janusz Marecki, Ammar Anwar
  • Improving Language Understanding from Screenshots (2024). Tianyu Gao, Zi-Rui Wang, Adithya Bhaskar, Danqi Chen
  • Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning (2021). Bryan Wang, Gang Li, Xin Zhou, Zhourong Chen, Tovi Grossman, Yang Li
  • MobileFlow: A Multimodal LLM For Mobile GUI Agent (2024). Songqin Nong, Jiali Zhu, Rui Wu, Jiongchao Jin, Shuo Shan, Xiutian Huang, Wenhao Xu
  • Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs (2024). Keen You, Haotian Zhang, Eldon Schoop, Floris Weers, Amanda Swearngin, Jeffrey Nichols, Yinfei Yang, Zhe Gan

Works That Cite This (0)
