End-to-End Human Pose and Mesh Reconstruction with Transformers

Type: Article

Publication Date: 2021-06-01

Citations: 506

DOI: https://doi.org/10.1109/cvpr46437.2021.00199

Abstract

We present a new method, called MEsh TRansfOrmer (METRO), to reconstruct 3D human pose and mesh vertices from a single image. Our method uses a transformer encoder to jointly model vertex-vertex and vertex-joint interactions, and outputs 3D joint coordinates and mesh vertices simultaneously. Compared to existing techniques that regress pose and shape parameters, METRO does not rely on any parametric mesh models like SMPL; thus, it can be easily extended to other objects such as hands. We further relax the mesh topology and allow the transformer self-attention mechanism to freely attend between any two vertices, making it possible to learn non-local relationships among mesh vertices and joints. With the proposed masked vertex modeling, our method is more robust and effective in handling challenging situations like partial occlusions. METRO generates new state-of-the-art results for human mesh reconstruction on the public Human3.6M and 3DPW datasets. Moreover, we demonstrate the generalizability of METRO to 3D hand reconstruction in the wild, outperforming existing state-of-the-art methods on the FreiHAND dataset.
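
The two ideas named in the abstract, masked vertex modeling and unrestricted self-attention between joint and vertex tokens, can be illustrated with a toy sketch. This is not the authors' implementation. The function names `mask_queries` and `self_attention`, the all-zero mask token, the 50% mask ratio, and the absence of learned projections are all assumptions made for illustration.

```python
import math
import random


def mask_queries(queries, mask_ratio, rng, mask_token=None):
    """Replace a random subset of query tokens with a mask token.

    Hypothetical helper: the mask token and ratio are illustrative choices.
    """
    n = len(queries)
    dim = len(queries[0])
    if mask_token is None:
        mask_token = [0.0] * dim  # assumption: all-zero placeholder token
    num_masked = int(round(mask_ratio * n))
    masked_idx = set(rng.sample(range(n), num_masked))
    out = [list(mask_token) if i in masked_idx else list(q)
           for i, q in enumerate(queries)]
    return out, sorted(masked_idx)


def self_attention(tokens):
    """Single-head scaled dot-product self-attention without learned weights.

    Every token attends to every other token: no mesh-topology mask is applied.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out


rng = random.Random(0)
# Toy input: 2 joint queries followed by 4 vertex queries, 3 features each.
queries = [[float(i), float(i + 1), float(i + 2)] for i in range(6)]
masked, idx = mask_queries(queries, 0.5, rng)
attended = self_attention(masked)
```

In this sketch, a masked token can only be filled in from the tokens that remain visible, which is the intuition behind using masking to encourage robustness to partial occlusion.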

Locations

  • arXiv (Cornell University)
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Similar Works

Action Title Year Authors
+ End-to-End Human Pose and Mesh Reconstruction with Transformers 2020 Kevin Lin
Lijuan Wang
Zicheng Liu
+ Leveraging the Learnable Vertex-Vertex Relationship to Generalize Human Pose and Mesh Reconstruction for In-the-Wild Scenes 2022 Trung Tran-Quang
Cuong Than-Cao
Hai Nguyen-Thanh
Hoang Si Hong
+ Leveraging the Learnable Vertex-Vertex Relationship to Generalize Human Pose and Mesh Reconstruction for In-the-Wild Scenes 2022 Trung Quang Tran
Cuong Cao Than
Hai Thanh Nguyen
Hoang Si Hong
+ MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh Reconstruction 2024 Kevin Lin
Chung-Ching Lin
Lin Liang
Zicheng Liu
Lijuan Wang
+ MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh Reconstruction 2022 Kevin Lin
Chung-Ching Lin
Liang Lin
Zicheng Liu
Lijuan Wang
+ SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation 2024 Xiangyu Xu
Lijuan Liu
Shuicheng Yan
+ A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose 2022 Ce Zheng
Matías Mendieta
Pu Wang
Aidong Lu
Chen Chen
+ A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose 2021 Ce Zheng
Matías Mendieta
Pu Wang
Aidong Lu
Chen Chen
+ PostoMETRO: Pose Token Enhanced Mesh Transformer for Robust 3D Human Mesh Recovery 2024 Wendi Yang
Zihang Jiang
Shang Zhao
S. Kevin Zhou
+ Mesh Graphormer 2021 Kevin Lin
Lijuan Wang
Zicheng Liu
+ FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER 2023 Ce Zheng
Matías Mendieta
Taojiannan Yang
Guo-Jun Qi
Chen Chen
+ FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER 2022 Ce Zheng
Matías Mendieta
Taojiannan Yang
Guo-Jun Qi
Chen Chen
+ Dual Grid Net: hand mesh vertex regression from single depth maps 2019 Chengde Wan
Thomas Probst
Luc Van Gool
Angela Yao
+ Learning Human Mesh Recovery in 3D Scenes 2023 Zehong Shen
Zhi Cen
Sida Peng
Qing Shuai
Hujun Bao
Xiaowei Zhou
+ Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose 2020 Hongsuk Choi
Gyeongsik Moon
Kyoung Mu Lee
+ Pixel-Aligned Non-parametric Hand Mesh Reconstruction 2022 Shijian Jiang
Guwen Han
Danhang Tang
Yang Zhou
Xiang Li
Jiming Chen
Qi Ye
+ THOR-Net: End-to-end Graformer-based Realistic Two Hands and Object Reconstruction with Self-supervision 2023 Ahmed Tawfik Aboukhadra
Jameel Malik
Ahmed Elhayek
Nadia Robertini
Didier Stricker

Works That Cite This (202)

Action Title Year Authors
+ RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-Consistent Dataset 2023 Zhongjin Luo
Shengcai Cai
Jinguo Dong
Ruibo Ming
Liangdong Qiu
Xiaohang Zhan
Xiaoguang Han
+ A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose 2022 Ce Zheng
Matías Mendieta
Pu Wang
Aidong Lu
Chen Chen
+ Regular Splitting Graph Network for 3D Human Pose Estimation 2023 Md. Tanvir Hassan
A. Ben Hamza
+ FAN-Trans: Online Knowledge Distillation for Facial Action Unit Detection 2023 Jing Yang
Jie Shen
Yiming Lin
Yordan Hristov
Maja Pantić
+ End-to-end weakly-supervised single-stage multiple 3D hand mesh reconstruction from a single RGB image 2023 Jinwei Ren
Jianke Zhu
Jialiang Zhang
+ CLIP-Hand3D: Exploiting 3D Hand Pose Estimation via Context-Aware Prompting 2023 Shaoxiang Guo
Qing Cai
Lin Qi
Junyu Dong
+ Spatially Multi-conditional Image Generation 2023 Nikola Popović
Ritika Chakraborty
Danda Pani Paudel
Thomas Probst
Luc Van Gool
+ Uplift and Upsample: Efficient 3D Human Pose Estimation with Uplifting Transformers 2023 Moritz Einfalt
Katja Ludwig
Rainer Lienhart
+ Learnable Human Mesh Triangulation for 3D Human Pose and Shape Estimation 2023 Sung-Ho Chun
Sungbum Park
Ju Yong Chang
+ Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats 2023 István Sárándi
Alexander Hermans
Bastian Leibe