An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data

Type: Article

Publication Date: 2017-02-12

Citations: 786

DOI: https://doi.org/10.1609/aaai.v31i1.11212

Abstract

Human action recognition is an important task in computer vision. Extracting discriminative spatial and temporal features to model the spatial and temporal evolution of different actions plays a key role in accomplishing this task. In this work, we propose an end-to-end spatial and temporal attention model for human action recognition from skeleton data. We build our model on top of Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM); the model learns to selectively focus on the discriminative joints of the skeleton within each input frame and pays different levels of attention to the outputs of different frames. Furthermore, to ensure effective training of the network, we propose a regularized cross-entropy loss to drive the model learning process and develop a joint training strategy accordingly. Experimental results demonstrate the effectiveness of the proposed model on both the small SBU human action recognition dataset and the NTU dataset, currently the largest.
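
The two attention mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the equations, so the softmax normalization over joints, the non-negative frame gates, and every name and number below are assumptions chosen for illustration. In a trained model the scores and gates would be produced by learned LSTM sub-networks, and the weighted joints would pass through the main LSTM before the per-frame outputs are pooled.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention(frame, joint_scores):
    """Scale each joint's coordinates by its attention weight within one frame.

    Assumption: weights are a softmax over the joints of that frame.
    """
    weights = softmax(joint_scores)
    return [[w * c for c in joint] for w, joint in zip(weights, frame)]


def temporal_attention(frame_outputs, frame_gates):
    """Sum per-frame class scores, each scaled by a non-negative frame gate.

    Assumption: gates are clipped at zero so a frame can be ignored
    but never counted negatively.
    """
    num_classes = len(frame_outputs[0])
    pooled = [0.0] * num_classes
    for out, gate in zip(frame_outputs, frame_gates):
        g = max(0.0, gate)
        for k in range(num_classes):
            pooled[k] += g * out[k]
    return pooled


# Toy input: 2 frames, 3 joints, 3-D coordinates (made-up values).
frames = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]],
]
# Hypothetical joint scores; a real model would learn these.
joint_scores = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
weighted = [spatial_attention(f, s) for f, s in zip(frames, joint_scores)]

# Hypothetical per-frame class scores and frame gates (2 classes).
frame_outputs = [[1.0, 0.0], [0.0, 1.0]]
frame_gates = [0.25, 0.75]
class_scores = temporal_attention(frame_outputs, frame_gates)
probs = softmax(class_scores)
```

In this toy run the first frame spreads its attention evenly over the three joints, the second frame concentrates on the first joint, and the second frame's output dominates the pooled class scores because its gate is larger.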

Locations

  • Proceedings of the AAAI Conference on Artificial Intelligence
  • arXiv (Cornell University)

Similar Works

Title | Year | Authors
An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data | 2016 | Sijie Song, Cuiling Lan, Junliang Xing, Wenjun Zeng, Jiaying Liu
Learning Coupled Spatial-temporal Attention for Skeleton-based Action Recognition | 2019 | Jiayun Wang
An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition | 2019 | Chenyang Si, Wentao Chen, Wei Wang, Liang Wang, Tieniu Tan
Skeleton-Based Human Action Recognition With Global Context-Aware Attention LSTM Networks | 2017 | Jun Liu, Gang Wang, Ling‐Yu Duan, Kamila Abdiyeva, Alex C. Kot
Self-Attention Network for Skeleton-based Human Action Recognition | 2020 | Sangwoo Cho, M. H. Maqbool, Fei Liu, Hassan Foroosh
Self-Attention Network for Skeleton-based Human Action Recognition | 2019 | Sangwoo Cho, M. H. Maqbool, Fei Liu, Hassan Foroosh
Memory Attention Networks for Skeleton-based Action Recognition | 2018 | Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Jianzhuang Liu
Memory Attention Networks for Skeleton-based Action Recognition | 2018 | Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Changqing Zou, Jianzhuang Liu
Action Recognition with Spatio-Temporal Visual Attention on Skeleton Image Sequences | 2018 | Zhengyuan Yang, Yuncheng Li, Shuicheng Yan, Jiebo Luo
Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks | 2016 | Wentao Zhu, Cuiling Lan, Junliang Xing, Wenjun Zeng, Yanghao Li, Li Shen, Xiaohui Xie
Skeleton based Activity Recognition by Fusing Part-wise Spatio-temporal and Attention Driven Residues | 2019 | Chhavi Dhiman, Dinesh Kumar Vishwakarma, Paras Aggarwal
ARN-LSTM: A Multi-Stream Attention-Based Model for Action Recognition with Temporal Dynamics | 2024 | Chuanchuan Wang, Ahmad Sufril Azlan Mohmamed, Xiao Yang, Xiang Li
Skeleton-Based Relational Modeling for Action Recognition | 2018 | Lin Li, Zheng Wu, Zhaoxiang Zhang, Yan Huang, Liang Wang
Action Recognition with Visual Attention on Skeleton Images | 2018 | Zhengyuan Yang, Yuncheng Li, Shuicheng Yan, Jiebo Luo
Skepxels: Spatio-temporal Image Representation of Human Skeleton Joints for Action Recognition | 2017 | Jian Liu, Naveed Akhtar, Ajmal Mian
Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates | 2017 | Jun Liu, Amir Shahroudy, Dong Xu, Alex C. Kot, Gang Wang
Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action Recognition | 2020 | Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

Works That Cite This (173)

Title | Year | Authors
Towards Coding For Human And Machine Vision: A Scalable Image Coding Approach | 2020 | Yueyu Hu, Shuai Yang, Wenhan Yang, Ling‐Yu Duan, Jiaying Liu
Hierarchically Self-supervised Transformer for Human Skeleton Representation Learning | 2022 | Yuxiao Chen, L. Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris Metaxas
ViLP: Knowledge Exploration using Vision, Language, and Pose Embeddings for Video Action Recognition | 2023 | Soumyabrata Chaudhuri, Saumik Bhattacharya
DWnet: Deep-wide network for 3D action recognition | 2020 | Yonghao Dang, Fuxing Yang, Jianqin Yin
Interpretable 3D Human Action Analysis with Temporal Convolutional Networks | 2017 | Tae Soo Kim, Austin Reiter
Skeleton-based action recognition via spatial and temporal transformer networks | 2021 | Chiara Plizzari, Marco Cannici, Matteo Matteucci
Prompted Contrast with Masked Motion Modeling: Towards Versatile 3D Action Representation Learning | 2023 | Jiahang Zhang, Lilang Lin, Jiaying Liu
Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition | 2020 | Zhen Huang, Xu Shen, Xinmei Tian, Houqiang Li, Jianqiang Huang, Xian-Sheng Hua
IGFormer: Interaction Graph Transformer for Skeleton-Based Human Interaction Recognition | 2022 | Yunsheng Pang, Qiuhong Ke, Hossein Rahmani, James Bailey, Jun Liu
Video-based Human Action Recognition using Deep Learning: A Review | 2022 | Hieu H. Pham, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio A. Velastín