Skeleton Action Recognition Based on Multi-Stream Spatial Attention Graph Convolutional SRU Network

ZHAO Jun-nan; SHE Qing-shan; MENG Ming; CHEN Yun

doi:10.12263/DZXB.20210416

PAPERS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 50, Issue 7, Pages: 1579-1585 (2022)
- Affiliation: College of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018
- DOI: 10.12263/DZXB.20210416
  CLC: TP391.4
- Received: 29 March 2021, Revised: 30 December 2021, Published: 25 July 2022
ZHAO Jun-nan, SHE Qing-shan, MENG Ming, et al. Skeleton Action Recognition Based on Multi-Stream Spatial Attention Graph Convolutional SRU Network[J]. ACTA ELECTRONICA SINICA, 2022, 50(07): 1579-1585. DOI: 10.12263/DZXB.20210416.

Abstract

Skeleton-based action recognition has attracted increasing attention. To address the slow inference speed and single data modality of most existing algorithms, a lightweight and efficient method is proposed. The network embeds the graph convolution operator in the simple recurrent unit (SRU) to construct the graph convolutional SRU (GC-SRU), which captures the spatial-temporal information of the data. Meanwhile, to enhance the distinction between nodes, a spatial attention network and multi-stream data fusion are used to extend GC-SRU into the multi-stream spatial attention graph convolutional SRU (MSAGC-SRU). Finally, the proposed method is evaluated on two public datasets. Experimental results show that the classification accuracy of the method on Northwestern-UCLA reaches 93.1% with a model cost of 4.4G FLOPs. On NTU RGB+D, the accuracy reaches 92.7% and 87.3% under the CV and CS evaluation protocols, respectively, with a model cost of 21.3G FLOPs. The proposed model achieves a good trade-off between computational efficiency and classification accuracy.
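The idea of embedding a graph convolution in an SRU can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the GC-SRU equations, so the sketch assumes a simplified SRU (gates depend only on the current input, as in the early SRU formulation) and a standard symmetrically normalized adjacency. The names `normalize_adjacency`, `gc_sru_forward`, and all weight arguments are hypothetical, and the spatial attention and multi-stream fusion parts are omitted.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a skeleton adjacency matrix A (assumed form)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gc_sru_forward(x, a_norm, w, w_f, w_r, b_f, b_r):
    """Simplified GC-SRU pass (illustrative assumption, not the paper's exact model).

    x: (T, N, d) array of T frames, N joints, d features per joint.
    Returns hidden states of shape (T, N, d).
    """
    # Graph convolution replaces the SRU's matrix products. It does not depend on
    # the previous state, so it is computed for all frames at once.
    ax = np.einsum("ij,tjd->tid", a_norm, x)  # aggregate neighbouring joints
    x_tilde = ax @ w                          # candidate state
    f = sigmoid(ax @ w_f + b_f)               # forget gate
    r = sigmoid(ax @ w_r + b_r)               # reset (highway) gate

    # Only this light element-wise recurrence runs sequentially over time.
    c = np.zeros_like(x[0])
    h = np.empty_like(x)
    for t in range(x.shape[0]):
        c = f[t] * c + (1.0 - f[t]) * x_tilde[t]
        h[t] = r[t] * c + (1.0 - r[t]) * x[t]  # highway connection to the input
    return h

# Toy usage: a 3-joint chain skeleton, 4 frames, 2 features per joint.
rng = np.random.default_rng(0)
adj = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
a_norm = normalize_adjacency(adj)
x = rng.standard_normal((4, 3, 2))
w, w_f, w_r = (0.1 * rng.standard_normal((2, 2)) for _ in range(3))
h = gc_sru_forward(x, a_norm, w, w_f, w_r, np.zeros(2), np.zeros(2))
```

Because the gates and the candidate state are computed for every frame before the loop, the sequential part reduces to element-wise operations, which is the property that makes SRU-style recurrences cheaper than LSTM-style ones.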
