Abstract: To meet the need of driver assistance systems and intelligent vehicles to recognize traffic police command gestures quickly and accurately, the articulated structure of traffic police gestures is first analyzed, and a model based on the key points and skeleton segments of the gestures is established. Second, the convolutional pose machine (CPM) is introduced to extract the key points of the traffic police gesture. The relative lengths of the skeleton segments and the angle of each segment with respect to gravity are then extracted as the spatial context features of the gesture. Meanwhile, a long short-term memory (LSTM) network is introduced to extract the temporal features of traffic police gestures. Finally, a Chinese traffic police gesture recognizer (CTPGR) based on CPM and LSTM is designed, and a two-hour traffic police gesture video is recorded to train and verify the CTPGR. Experimental results show that the CTPGR recognizes traffic police gestures with high accuracy and is fast enough for online gesture prediction.
张丞, 何坚, 王伟东. 空间上下文与时序特征融合的交警指挥手势识别技术[J]. 电子学报, 2020, 48(5): 966-974.
ZHANG Cheng, HE Jian, WANG Wei-dong. Visual Recognition of Chinese Traffic Police Gestures Based on Spatial Context and Temporal Features[J]. Acta Electronica Sinica, 2020, 48(5): 966-974.
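For illustration, the following Python sketch shows one way the per-frame spatial context features described in the abstract could be computed from 2D key points: the relative lengths of the skeleton segments (normalised by a reference segment) and the angle of each segment with respect to the gravity direction. The joint indices in BONES, the choice of the first segment as the normalisation reference, and the function name spatial_context_features are assumptions made for this example rather than the exact definitions used by the CTPGR; the resulting per-frame vectors would then be stacked over time and fed to the LSTM as the input sequence.

import numpy as np

# Hypothetical bone list: pairs of key-point indices (e.g. neck-shoulder,
# shoulder-elbow, elbow-wrist, ...). The actual CTPGR joint set may differ.
BONES = [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5), (0, 6)]

def spatial_context_features(keypoints):
    """keypoints: array of shape (num_joints, 2) in image coordinates.
    Returns relative segment lengths and segment-vs-gravity angles."""
    gravity = np.array([0.0, 1.0])                  # image y-axis points downward
    vecs = np.array([keypoints[j] - keypoints[i] for i, j in BONES])
    lengths = np.linalg.norm(vecs, axis=1)
    # Relative lengths: normalise by the first (reference) segment so the
    # features are insensitive to the officer's distance from the camera.
    rel_lengths = lengths / (lengths[0] + 1e-6)
    # Angle of each segment with respect to the gravity direction.
    cos_ang = (vecs @ gravity) / (lengths + 1e-6)
    angles = np.arccos(np.clip(cos_ang, -1.0, 1.0))
    return np.concatenate([rel_lengths, angles])

# Example: features for one frame with 7 detected key points (synthetic data).
frame_keypoints = np.random.rand(7, 2) * 100.0
features = spatial_context_features(frame_keypoints)    # shape: (12,)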