Source Cell-Phone Identification Under Background Noise Based on Low-Dimensional Deep Features

Acta Electronica Sinica, 2021, Vol. 49, Issue 4: 637-646. DOI: 10.12263/DZXB.20200658

Research Article

SU Zhao-pin^1,2,3,4; WU Zhang-qian^2; YUE Feng^2,4; WU Qin-fang^2; ZHANG Guo-fu^1,2,3,4

Abstract

Source cell-phone identification from recorded speech has become an active research topic in multimedia forensics in recent years, but most existing studies are limited to clean speech or to speech with artificially added background noise. This paper studies cell-phone speech recorded under natural environmental background noise and proposes a source cell-phone identification method based on low-dimensional deep features. First, log-domain Mel filter bank coefficients are extracted as the basic acoustic features. These features are fed into a temporal convolutional network (TCN) for training, which extracts deep features that characterize the recording device. Linear discriminant analysis (LDA) then reduces the dimensionality of the deep features and removes their redundancy. Finally, the resulting low-dimensional deep features are fed into a support vector machine (SVM) for classification and identification. Tests on a corpus of 37600 speech samples with natural environmental background noise, recorded with 47 cell-phone models, show that the proposed method achieves better identification performance under natural background noise and adapts well to different brands, different models of the same brand, different sample lengths, different dataset sizes, and different sampling rates.
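The first stage of the method, extracting log-domain Mel filter bank coefficients, can be illustrated with a minimal sketch for a single frame. Everything specific here is an assumption made for illustration and is not taken from the paper: the HTK-style Mel formula, the Hamming window, the 256-sample frame, the 8 kHz sampling rate, the 8 filters, and all function names. The sketch uses only the Python standard library and covers only this front end; the TCN, LDA, and SVM stages are not shown.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """Convert Hz to Mel (HTK-style formula; an assumption, not from the paper)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Return n_filters triangular filters over the n_fft // 2 + 1 FFT bins."""
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, evenly spaced on the Mel scale.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))  # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Hamming-windowed power spectrum via a naive DFT (fine for short frames)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def log_mel_energies(frame, sample_rate, n_filters=8, eps=1e-10):
    """Natural log of the Mel filter bank energies for one frame."""
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    spectrum = power_spectrum(frame)
    return [math.log(sum(w * p for w, p in zip(row, spectrum)) + eps) for row in bank]


# Hypothetical input: one 256-sample frame of a 1 kHz tone sampled at 8 kHz.
tone = [math.sin(2.0 * math.pi * 1000.0 * i / 8000.0) for i in range(256)]
features = log_mel_energies(tone, sample_rate=8000)
print([round(v, 2) for v in features])
```

For the test tone, the largest coefficient should come from the filter whose passband covers 1 kHz. In a real front end the same computation would be applied to every overlapping frame of a recording, typically with an FFT instead of the naive DFT, giving a time-by-filter matrix that a sequence model can consume.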

Cite this article

SU Zhao-pin, WU Zhang-qian, YUE Feng, WU Qin-fang, ZHANG Guo-fu. Source Cell-Phone Identification Under Background Noise Based on Low-Dimensional Deep Features[J]. Acta Electronica Sinica, 2021, 49(4): 637-646. https://doi.org/10.12263/DZXB.20200658

CLC number: TN912.3

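Funding

National Natural Science Foundation of China (No. 61573125); Humanities and Social Sciences Youth Foundation of the Ministry of Education of China (No. 19YJC870021, No. 18YJC870025); Anhui Provincial Key Research and Development Program (No. 202004d07020011); Fundamental Research Funds for the Central Universities (No. PA2020GDKC0015, No. PA2019GDQT0008, No. PA2019GDPK0072)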