Dynamic Facial Expression Recognition Based on Multi-Visual and Audio Descriptors
LI Hong-fei1,2, LI Qing1,2, ZHOU Li1,2
1. Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China;
2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: Communication in any form, verbal or non-verbal, is vital to everyday tasks and plays a significant role in life. Facial expression is the most effective form of non-verbal communication, and it provides clues about a person's emotional state, mindset, and intention. Facial expression recognition has already been applied successfully in fields such as safe driving, merchandise sales, and clinical medicine. This work explores key techniques for facial expression recognition; its main contributions are as follows. First, a dynamic facial expression recognition algorithm based on multi-visual descriptors and audio features is proposed for unrestricted conditions. Dynamic facial features are extracted with multi-visual descriptors that provide a local spatial-temporal feature representation, and combining video and audio features improves recognition performance. Dynamic time warping based on timeline segmentation and covariance matrices proves effective for analyzing dynamic expression sequences of different durations. Second, to improve the generalization performance of the facial expression recognition model, an ensemble decision-making strategy based on weighted voting by multiple individual recognition models is introduced. To learn the weight of each individual recognition model effectively, two methods are proposed: voting weight learning by random re-sampling, and voting weight learning based on the comparative advantages of the individual recognition models. Finally, the ensemble model is applied, and recognition performance improves further. Experiments on the AFEW 5.0 dataset validate the performance of the proposed dynamic facial expression recognition algorithm.
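The weighted-voting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify how the re-sampling is done, so the sketch assumes that each model's weight is its mean accuracy over bootstrap resamples of a labeled validation set, normalized to sum to 1. The names `resampled_weights` and `weighted_vote` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def resampled_weights(model_preds, labels, rounds=200, seed=0):
    """Return one voting weight per model (weights sum to 1).

    Assumption (not from the paper): a model's weight is its mean accuracy
    over `rounds` bootstrap resamples of the validation set.
    model_preds[m][i] is model m's predicted label for validation sample i.
    """
    rng = random.Random(seed)
    n = len(labels)
    totals = [0.0] * len(model_preds)
    for _ in range(rounds):
        # Draw n validation indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        for m, preds in enumerate(model_preds):
            totals[m] += sum(preds[i] == labels[i] for i in idx) / n
    means = [t / rounds for t in totals]
    norm = sum(means)
    if norm == 0:
        # No model was ever correct: fall back to uniform weights.
        return [1.0 / len(means)] * len(means)
    return [x / norm for x in means]


def weighted_vote(sample_preds, weights):
    """Return the label with the largest total weight for one test sample."""
    scores = defaultdict(float)
    for label, w in zip(sample_preds, weights):
        scores[label] += w
    return max(scores, key=scores.get)


# Toy validation data: three hypothetical models, four labeled clips.
labels = ["happy", "sad", "angry", "happy"]
model_preds = [
    ["happy", "sad", "angry", "happy"],   # always correct
    ["happy", "sad", "happy", "sad"],     # correct on two clips
    ["sad", "happy", "happy", "sad"],     # never correct
]
weights = resampled_weights(model_preds, labels)
decision = weighted_vote(["happy", "sad", "sad"], [0.6, 0.2, 0.2])
```

In this sketch a single strong model can outvote two weaker models that agree with each other, which is the behavior that distinguishes weighted voting from simple majority voting.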