A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective

Type: Article

Publication Date: 2024-08-19

Citations: 6

DOI: https://doi.org/10.1109/tpami.2024.3445463

Abstract

Graph Neural Networks (GNNs) have gained momentum in graph representation learning and boosted the state of the art in a variety of areas, such as data mining (e.g., social network analysis and recommender systems), computer vision (e.g., object detection and point cloud learning), and natural language processing (e.g., relation extraction and sequence learning), to name a few. With the emergence of Transformers in natural language processing and computer vision, graph Transformers embed a graph structure into the Transformer architecture to overcome the limitations of local neighborhood aggregation while avoiding strict structural inductive biases. In this paper, we present a comprehensive review of GNNs and graph Transformers in computer vision from a task-oriented perspective. Specifically, we divide their applications in computer vision into five categories according to the modality of input data, i.e., 2D natural images, videos, 3D data, vision + language, and medical images. In each category, we further divide the applications according to a set of vision tasks. Such a task-oriented taxonomy allows us to examine how each task is tackled by different GNN-based approaches and how well these approaches perform. Based on the necessary preliminaries, we provide the definitions and challenges of the tasks, in-depth coverage of the representative approaches, as well as discussions regarding insights, limitations, and future directions.
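
The contrast the abstract draws between local neighborhood aggregation and graph Transformers can be illustrated with a small sketch. It is not taken from the survey: the function names, the toy path graph, and the additive `edge_bias` term are all assumptions chosen for illustration, and the attention omits learned query/key/value projections. It only shows that one round of neighbor averaging sees direct neighbors, whereas global attention with a structural bias can mix in information from any node.

```python
import math


def mean_aggregate(features, adjacency):
    """One round of local aggregation: each node averages its own
    feature vector with those of its direct neighbors only."""
    n, dim = len(features), len(features[0])
    out = []
    for i in range(n):
        group = [i] + [j for j in range(n) if j != i and adjacency[i][j]]
        out.append([sum(features[j][d] for j in group) / len(group)
                    for d in range(dim)])
    return out


def graph_biased_attention(features, adjacency, edge_bias=1.0):
    """Global self-attention over all nodes. Connected pairs receive an
    additive score bonus (edge_bias), a hypothetical and simplified way
    to inject graph structure; no learned projections are used."""
    n, dim = len(features), len(features[0])
    out = []
    for i in range(n):
        scores = []
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            scores.append(dot / math.sqrt(dim)
                          + (edge_bias if adjacency[i][j] else 0.0))
        # Numerically stable softmax over all nodes, not just neighbors.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(weights[j] / total * features[j][d] for j in range(n))
                    for d in range(dim)])
    return out


# Toy path graph 0 - 1 - 2 - 3; only the end nodes carry a signal.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
features = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]

local_out = mean_aggregate(features, adjacency)
global_out = graph_biased_attention(features, adjacency)
```

In this toy setup, node 0 receives nothing from node 3 after one round of `mean_aggregate`, because they are three hops apart, while `graph_biased_attention` gives node 0 a nonzero share of node 3's feature in a single step.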

Locations

  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • arXiv (Cornell University)
  • PubMed

Similar Works

  • Graph-level Neural Networks: Current Progress and Future Directions (2022). Ge Zhang, Jia Wu, Jian Yang, Shan Xue, Wenbin Hu, Chuan Zhou, Hao Peng, Quan Z. Sheng, Charu C. Aggarwal
  • Graph Neural Networks: Taxonomy, Advances, and Trends (2022). Yu Zhou, Haixia Zheng, Xin Huang, Shufeng Hao, Dengao Li, Jumin Zhao
  • Two Stream Scene Understanding on Graph Embedding (2023). Wenkai Yang, Wenyuan Sun, Runxaing Huang
  • GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs (2024). Mustafa Munir, William Avery, Md. Mostafijur Rahman, Radu Mărculescu
  • Graph Neural Networks: Taxonomy, Advances and Trends (2020). Yu Zhou, Haixia Zheng, Xin Huang
  • Graph Barlow Twins: A self-supervised representation learning framework for graphs (2022). Piotr Bielak, Tomasz Kajdanowicz, Nitesh V. Chawla
  • A Survey on Graph Structure Learning: Progress and Opportunities (2021). Yanqiao Zhu, Weizhi Xu, Jinghao Zhang, Yuanqi Du, Jieyu Zhang, Qiang Liu, Carl Yang, Shu Wu
  • Graph Barlow Twins: A self-supervised representation learning framework for graphs (2021). Piotr Bielak, Tomasz Kajdanowicz, Nitesh V. Chawla
  • Self-Supervised Learning of Graph Neural Networks: A Unified Review (2022). Yaochen Xie, Xu Zhao, Jingtun Zhang, Zhengyang Wang, Shuiwang Ji
  • Self-Supervised Learning of Graph Neural Networks: A Unified Review (2021). Yaochen Xie, Xu Zhao, Jingtun Zhang, Zhengyang Wang, Shuiwang Ji
  • Data-centric Graph Learning: A Survey (2023). Cheng Yang, Deyu Bo, Jixi Liu, Yufei Peng, Boyu Chen, Hao-Ran Dai, Ao Sun, Yue Yu, Yixin Xiao, Qi Zhang
  • The More You Know: Using Knowledge Graphs for Image Classification (2016). Kenneth Marino, Ruslan Salakhutdinov, Abhinav Gupta
  • The More You Know: Using Knowledge Graphs for Image Classification (2017). Kenneth Marino, Ruslan Salakhutdinov, Abhinav Gupta
  • Graph Transformers: A Survey (2024). Ahsan Shehzad, Xia Feng, Shagufta Abid, Ciyuan Peng, Shuo Yu, Dongyu Zhang, Karin Verspoor
  • Deep Learning on Attributed Graphs: A Journey from Graphs to Their Embeddings and Back (2019). Martin Simonovsky
  • Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks (2020). Franco Manessi, Alessandro Rozza
  • Graph Propagation Transformer for Graph Representation Learning (2023). Zhe Chen, Hao Tan, Tao Wang, Tianrun Shen, Tong Lu, Qiuying Peng, Cheng Cheng, Yue Qi
  • Representing Long-Range Context for Graph Neural Networks with Global Attention (2022). Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica

Works Cited by This (159)

  • VQA: Visual Question Answering (2015). Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh
  • Deep Residual Learning for Image Recognition (2016). Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
  • Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding (2016). Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, Abhinav Gupta
  • Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs (2017). Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, Michael M. Bronstein
  • Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering (2017). Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, Devi Parikh
  • Modeling Relational Data with Graph Convolutional Networks (2018). Michael Schlichtkrull, Thomas Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling
  • Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs (2017). Martin Simonovsky, Nikos Komodakis
  • The Kinetics Human Action Video Dataset (2017). Andrew Zisserman, João Carreira, Karen Simonyan, Will Kay, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, T.C. Green, Trevor Back
  • Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering (2018). Peter Anderson, Xiaodong He, Chris Buehler, Damien Teney, Mark Johnson, Stephen Jay Gould, Lei Zhang
  • FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis (2018). Nitika Verma, Edmond Boyer, Jakob Verbeek