

School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China
Received: 29 March 2021
Revised: 30 December 2021
Published: 25 July 2022
ZHAO Jun-nan, SHE Qing-shan, MENG Ming, et al. Skeleton Action Recognition Based on Multi-Stream Spatial Attention Graph Convolutional SRU Network[J]. Acta Electronica Sinica, 2022, 50(07): 1579-1585. DOI: 10.12263/DZXB.20210416.
Skeleton-based action recognition has attracted increasing attention. To address the slow inference speed and single data modality of most existing algorithms, this paper proposes a lightweight and efficient method. The network embeds a graph convolution operator in the simple recurrent unit (SRU) to construct a graph convolutional SRU (GC-SRU) that captures the spatio-temporal information of the data. To strengthen the distinction between nodes, a spatial attention network and multi-stream data fusion are then used to extend GC-SRU into a multi-stream spatial attention graph convolutional SRU (MSAGC-SRU). The method is evaluated on two public datasets. On Northwestern-UCLA, it reaches a classification accuracy of 93.1% with 4.4G FLOPs. On NTU RGB+D, it reaches 92.7% and 87.3% under the cross-view (CV) and cross-subject (CS) evaluation protocols, respectively, with 21.3G FLOPs. The model thus achieves a good trade-off between computational efficiency and classification accuracy.
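The idea of embedding a graph convolution operator in the SRU can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the SRU gate equations of Lei et al. with each input projection replaced by a standard normalized-adjacency graph convolution, and it keeps the input and hidden channel counts equal so the highway connection is valid. The function names, the chain-shaped toy skeleton, and all sizes are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def normalize_adjacency(adj):
    """Symmetrically normalize A + I (assumed GCN-style normalization)."""
    a_hat = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]


def graph_conv(a_norm, x, w):
    """Aggregate features over neighboring joints, then project channels."""
    return a_norm @ x @ w


def gc_sru_step(x_t, c_prev, a_norm, params):
    """One time step of an SRU whose input projections are graph convolutions.

    x_t and c_prev have shape (num_joints, channels).
    """
    w, w_f, w_r, v_f, v_r, b_f, b_r = params
    x_tilde = graph_conv(a_norm, x_t, w)
    # Forget and reset gates depend only on x_t and c_{t-1}, as in the SRU.
    f_t = sigmoid(graph_conv(a_norm, x_t, w_f) + v_f * c_prev + b_f)
    r_t = sigmoid(graph_conv(a_norm, x_t, w_r) + v_r * c_prev + b_r)
    c_t = f_t * c_prev + (1.0 - f_t) * x_tilde
    h_t = r_t * c_t + (1.0 - r_t) * x_t  # highway connection to the input
    return h_t, c_t


# Hypothetical toy setup: 5 joints connected in a chain, 4 channels, 6 frames.
rng = np.random.default_rng(0)
num_joints, channels, num_frames = 5, 4, 6
adj = np.zeros((num_joints, num_joints))
for i in range(num_joints - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
a_norm = normalize_adjacency(adj)

params = (
    rng.normal(scale=0.1, size=(channels, channels)),  # w
    rng.normal(scale=0.1, size=(channels, channels)),  # w_f
    rng.normal(scale=0.1, size=(channels, channels)),  # w_r
    rng.normal(scale=0.1, size=channels),              # v_f
    rng.normal(scale=0.1, size=channels),              # v_r
    np.zeros(channels),                                # b_f
    np.zeros(channels),                                # b_r
)

sequence = rng.normal(size=(num_frames, num_joints, channels))
c = np.zeros((num_joints, channels))
for x_t in sequence:
    h, c = gc_sru_step(x_t, c, a_norm, params)
print(h.shape)
```

Because the three graph convolutions depend only on the input at each frame, they could be computed for all frames at once, leaving only cheap element-wise operations in the recurrence. This is the property that makes the SRU attractive for a lightweight model.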