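1. School of Cyber Security and Computer, Hebei University, Baoding, Hebei 071002, China
2. Machine Vision Engineering Research Center of Hebei Province, Baoding, Hebei 071002, China
LEI Yu, female, born in 1995 in Jinzhong, Shanxi. She is a master's student at the School of Cyber Security and Computer, Hebei University. Her research interests are deep learning and image segmentation techniques and their applications.
CUI Zhen-chao (corresponding author), male, born in 1983 in Handan, Hebei. Lecturer and member of the China Computer Federation. He received his bachelor's degree from Yanshan University in 2007, his master's degree from Yanshan University in 2010, and his doctoral degree from Harbin Institute of Technology in 2015. He is currently a faculty member at the School of Cyber Security and Computer, Hebei University, working mainly on artificial intelligence and machine vision. E-mail: cuizhenchao@gmail.com
CHEN Li-ping, female, born in 1974 in Baoding, Hebei. Lecturer. She received her bachelor's degree from Hebei Agricultural University in 1997 and her master's degree in 2000. She is currently a faculty member at the School of Cyber Security and Computer, Hebei University, working mainly on machine vision.
CHEN Xiang-yang, female, born in 1977 in Sanmenxia, Henan. Lecturer. She received her bachelor's degree from Yanshan University in 2000 and her master's degree from Hebei University in 2007. She is currently a faculty member at the School of Cyber Security and Computer, Hebei University. Her research interest is deep learning.
WANG Yu-xiao, male, born in 1997 in Langfang, Hebei. He is a master's student at the School of Cyber Security and Computer, Hebei University. His research interests are deep learning and image classification.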
Received: 2020-12-31
Revised: 2021-03-27
Published online: 2022-07-04
Yu LEI, Zhen-chao CUI, Li-ping CHEN, et al. Hand Gesture Recognition Based on IASPP-ResNet Segmentation Algorithm[J/OL]. ACTA ELECTRONICA SINICA, 2022, 1-12. DOI: 10.12263/DZXB.20210049.
Gesture recognition is an important research area in computer vision and a key component of human-computer interaction. Because recognition results are affected by complex backgrounds, gesture recognition remains highly challenging. To address this problem, this paper proposes a new gesture recognition algorithm built on a combined model of dense segmentation followed by gesture classification. For the dense segmentation part, this paper proposes Improved Atrous Spatial Pyramid Pooling (IASPP). IASPP is a pooling layer in a convolutional neural network that densely connects atrous convolutions with different atrous rates, combining the cascade and parallel modes of atrous spatial pyramid pooling. It thereby captures multi-scale gesture information over different fields of view and yields more precise feature representations. In addition, to fuse detail and spatial location information from different levels and improve the segmentation performance of the whole network, IASPP is embedded in a ResNet with an encoder-decoder structure; the resulting gesture segmentation algorithm is named IASPP-ResNet. For the gesture recognition part, a deep convolutional neural network model is used and achieves a high recognition rate. Experimental results show that, on commonly used public datasets, the IASPP-ResNet segmentation algorithm is more accurate than both traditional machine learning methods and deep learning-based methods. The proposed combined model of dense segmentation and gesture classification reaches a gesture recognition rate of 98.63% on the NUS-II dataset, outperforming existing gesture recognition algorithms.
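To make the multi-scale claim concrete, the following is a minimal sketch of how densely connecting atrous convolutions enlarges the set of receptive fields compared with purely parallel branches. It is not the authors' IASPP implementation: the 3x3 kernel size and the rates (3, 6, 12, 18) are assumptions chosen for illustration, since the abstract does not state them, and the function names are hypothetical. The sketch relies only on the standard facts that an atrous kernel of size k and rate r spans k + (k - 1)(r - 1) pixels and that receptive fields add up along a chain of stride-1 convolutions.

```python
from itertools import combinations


def effective_kernel(kernel_size: int, rate: int) -> int:
    """Span in pixels covered by one atrous kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def parallel_receptive_fields(rates, kernel_size=3):
    """Parallel branches (ASPP style): one receptive field per branch."""
    return sorted({effective_kernel(kernel_size, r) for r in rates})


def dense_receptive_fields(rates, kernel_size=3):
    """Densely connected cascade: each layer sees the input and all earlier
    outputs, so every ordered subset of layers forms a path to the output.
    Along a path of stride-1 layers the receptive field grows additively."""
    fields = set()
    for n in range(1, len(rates) + 1):
        for path in combinations(rates, n):
            growth = sum(effective_kernel(kernel_size, r) - 1 for r in path)
            fields.add(1 + growth)
    return sorted(fields)


# Hypothetical atrous rates, NOT taken from the paper.
RATES = (3, 6, 12, 18)

parallel = parallel_receptive_fields(RATES)
dense = dense_receptive_fields(RATES)

print("parallel:", parallel)
print("dense:   ", dense)
```

Under these assumed rates, the parallel arrangement offers four receptive field sizes (7, 13, 25, 37), while the dense arrangement offers thirteen, from 7 up to 79 in steps of 6. This denser sampling of scales is the general motivation for connecting atrous convolutions densely; the actual rates and layer layout of IASPP would have to be taken from the full paper.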