The auditory quality of wideband audio is generally degraded by the loss of high-frequency components in network transmission. This paper therefore presents an audio bandwidth extension method from wideband to super-wideband based on a local least squares support vector machine. In view of the nonlinearity of the audio spectrum, the high-frequency fine spectrum of the audio signal is predicted using phase space reconstruction and the local least squares support vector machine. Combined with an estimate of the high-frequency sub-band energy obtained from a Gaussian mixture model, the proposed method adjusts the envelope of the predicted high-frequency spectrum and thereby recovers the high-frequency components in the 7 kHz to 14 kHz range. Subjective and objective test results indicate that the proposed method improves the auditory quality of wideband audio and outperforms the reference audio bandwidth extension methods.
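The two techniques named above, phase space reconstruction and a local least squares support vector machine (LS-SVM), can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameter values, so the embedding dimension `dim`, delay `tau`, neighbour count `k`, regularisation `gamma`, RBF width `sigma`, and all function names are assumptions chosen for illustration. The sketch treats a generic scalar sequence, whereas the paper applies the idea to audio spectrum coefficients.

```python
import math

def delay_embed(series, dim, tau):
    """Phase space reconstruction: vector i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    span = (dim - 1) * tau
    return [[series[i + j * tau] for j in range(dim)]
            for i in range(len(series) - span)]

def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def rbf(a, b, sigma):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return math.exp(-sqdist(a, b) / (2.0 * sigma ** 2))

def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_fit(X, y, gamma, sigma):
    """LS-SVM regression: solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(X, b, alpha, sigma, query):
    """Evaluate the fitted LS-SVM at a query vector."""
    return b + sum(a * rbf(x, query, sigma) for a, x in zip(alpha, X))

def local_lssvm_predict(series, dim=3, tau=1, k=8, gamma=100.0, sigma=1.0):
    """Predict the value following `series` from the k delay vectors nearest its last one."""
    vecs = delay_embed(series, dim, tau)
    query = vecs[-1]
    cands = vecs[:-1]  # delay vectors whose successor sample is known
    span = (dim - 1) * tau
    targets = [series[i + span + 1] for i in range(len(cands))]
    order = sorted(range(len(cands)), key=lambda i: sqdist(cands[i], query))[:k]
    X = [cands[i] for i in order]
    y = [targets[i] for i in order]
    b, alpha = lssvm_fit(X, y, gamma, sigma)
    return lssvm_predict(X, b, alpha, sigma, query)
```

Calling `local_lssvm_predict(series)` on a list of floats returns a one-step-ahead estimate. The model is "local" in the sense that it is fitted only to the `k` nearest neighbours of the current delay vector rather than to the whole sequence, which keeps the linear system small.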
BAI Hai-chuan, BAO Chang-chun, LIU Xin. Audio Bandwidth Extension Method Based on Local Least Square Support Vector Machine[J]. Acta Electronica Sinica, 2016, 44(9): 2203-2210. (in Chinese)