A Dynamic Head Gesture Recognition Method Based on 3D Convolutional Two‑Stream Network Fusion

XIE Jia‑long; ZHANG Bo‑tao; LÜ Qiang

doi:10.12263/DZXB.20201183


Research article | Last updated: 2025-12-08

- Chinese title: 一种基于双流融合3D卷积神经网络的动态头势识别方法
- English title: A Dynamic Head Gesture Recognition Method Based on 3D Convolutional Two‑Stream Network Fusion
- Journal: Acta Electronica Sinica, 2021, vol. 49, no. 7, pp. 1363-1369
- Author affiliation:
  
  School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China
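- About the authors:
  
  XIE Jia‑long, male, was born in Nanchong, Sichuan Province, in June 1997. He is currently a graduate student at the School of Automation, Hangzhou Dianzi University. His research interests include robot control and human-machine interaction. E‑mail: jolen_xie@hotmail.com
  ZHANG Bo‑tao (corresponding author), male, was born in Weifang, Shandong Province, in September 1982. He is currently an associate professor at Hangzhou Dianzi University. His main research interests include machine vision and robot motion planning and control. E‑mail: billow@hdu.edu.cn
  LÜ Qiang, male, was born in Fushun, Liaoning Province, in July 1977. He is currently a professor at Hangzhou Dianzi University. His main research interests include multi-robot cooperative control and swarm intelligence. E‑mail: lvqiang@hdu.edu.cn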
- Funding:
  
  Key Research and Development Program of Zhejiang Province (2019C04018); National Natural Science Foundation of China (62073108)
- DOI: 10.12263/DZXB.20201183
- Chinese Library Classification (CLC) number: TP183
- Received: 2020-10-23; revised: 2020-12-18; published in print: 2021-07-25
Cite this article:

XIE Jia‑long, ZHANG Bo‑tao, LÜ Qiang. A Dynamic Head Gesture Recognition Method Based on 3D Convolutional Two‑Stream Network Fusion[J]. Acta Electronica Sinica, 2021, 49(07): 1363-1369. DOI: 10.12263/DZXB.20201183.

Abstract

Existing vision-based dynamic head gesture recognition algorithms generalize poorly and achieve low recognition rates, while methods that rely on head‑mounted sensors are costly and inconvenient to carry. To address these problems, a dynamic head gesture recognition algorithm that requires no head‑mounted device is proposed. The method is based on a two‑stream fusion 3D convolutional neural network (3DCNN): dense optical flow is generated from the head movements, the original data and the optical flow data are fed in parallel into the constructed motion feature extractor, and the resulting features are finally fused. Experimental results show that the proposed algorithm achieves higher recognition accuracy and better generalization than hand-crafted feature extraction methods and the C3D (Convolutional 3D) model, and that, without any head‑mounted sensor, its recognition rate is close to that of head‑mounted‑sensor methods.
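The data flow described in the abstract (a motion stream derived from the frames, two parallel 3D convolutional extractors, and a final fusion step) can be illustrated with a small sketch. This is not the authors' network. It is a toy in pure Python that makes several assumptions: absolute frame differencing stands in for dense optical flow, each stream has a single 3D convolution with ReLU followed by global averaging, and fusion is plain concatenation. All function names and the kernels are hypothetical.

```python
from typing import List

# A single-channel video volume indexed as [time][row][col].
Volume = List[List[List[float]]]


def motion_stream(clip: Volume) -> Volume:
    """Assumed stand-in for dense optical flow: absolute difference
    between consecutive frames (yields one frame fewer than the clip)."""
    return [
        [[abs(b - a) for a, b in zip(row0, row1)] for row0, row1 in zip(f0, f1)]
        for f0, f1 in zip(clip, clip[1:])
    ]


def conv3d_valid(clip: Volume, kernel: Volume) -> Volume:
    """Single-channel 3D cross-correlation (stride 1, no padding) with ReLU."""
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    t, h, w = len(clip), len(clip[0]), len(clip[0][0])
    out = []
    for z in range(t - kt + 1):
        plane = []
        for y in range(h - kh + 1):
            row = []
            for x in range(w - kw + 1):
                s = sum(
                    clip[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                    for dz in range(kt)
                    for dy in range(kh)
                    for dx in range(kw)
                )
                row.append(max(0.0, s))  # ReLU activation
            plane.append(row)
        out.append(plane)
    return out


def global_average(volume: Volume) -> float:
    """Collapse a feature volume to one scalar feature."""
    values = [v for plane in volume for row in plane for v in row]
    if not values:
        raise ValueError("kernel is larger than the input volume")
    return sum(values) / len(values)


def two_stream_features(
    clip: Volume, raw_kernels: List[Volume], motion_kernels: List[Volume]
) -> List[float]:
    """Run both streams in parallel and fuse by concatenation (an assumption)."""
    motion = motion_stream(clip)
    raw_feats = [global_average(conv3d_valid(clip, k)) for k in raw_kernels]
    motion_feats = [global_average(conv3d_valid(motion, k)) for k in motion_kernels]
    return raw_feats + motion_feats


if __name__ == "__main__":
    # Toy clip: a bright pixel moving left to right over three 2x3 frames.
    clip = [
        [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
        [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
        [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],
    ]
    temporal_sum = [[[1.0]], [[1.0]]]  # hypothetical 2x1x1 kernel
    print(two_stream_features(clip, [temporal_sum], [temporal_sum]))
```

In a practical system the motion stream would come from a real dense optical flow estimator and each stream would be a deep, trained 3D CNN; the sketch only shows how the two inputs are produced, processed independently, and combined into one feature vector for classification.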

Keywords

