Audio Bandwidth Extension Based on Volterra Series Prediction

Acta Electronica Sinica, 2012, Vol. 40, Issue 12: 2501-2506.

Research Paper

张兴涛, 鲍长春, 刘鑫, 张丽燕

Audio Bandwidth Extension Based on Volterra Series

ZHANG Xing-tao, BAO Chang-chun, LIU Xin, ZHANG Li-yan

Abstract

This paper proposes a bandwidth extension method for wideband audio signals that applies nonlinear analysis based on the Volterra series. A Gaussian mixture model (GMM) and codebook mapping are used to adjust the spectral envelope and the energy gain of the extended audio signal, respectively. Experimental results indicate that the proposed algorithm outperforms existing nonlinear bandwidth extension algorithms. When it replaces the noise-filling technique in the ITU-T G.722.1C super-wideband audio codec, the super-wideband audio quality at 24 kbps improves.
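The abstract does not state the form of the predictor. As a rough illustration only, the sketch below fits a second-order Volterra series predictor to a time series by least squares and uses it for one-step prediction. The truncation at second order, the memory length of 2, the Hénon-map toy data, and every function name here are assumptions made for the example, not the authors' configuration, and the envelope and gain adjustment stages are not shown.

```python
def volterra_features(window):
    """Second-order Volterra terms for one window [x[n-1], ..., x[n-m]]:
    a constant, the linear terms, and all products x_i * x_j with i <= j."""
    m = len(window)
    feats = [1.0] + list(window)
    for i in range(m):
        for j in range(i, m):
            feats.append(window[i] * window[j])
    return feats


def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (aug[r][n] - s) / aug[r][r]
    return w


def fit_volterra(signal, memory):
    """Least-squares fit of the kernel weights via the normal equations."""
    rows = [volterra_features(signal[n - memory:n][::-1])
            for n in range(memory, len(signal))]
    targets = signal[memory:]
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
            for i in range(k)]
    rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(gram, rhs)


def predict_next(weights, history, memory):
    """Predict the sample that follows `history` from its last `memory` samples."""
    window = history[-memory:][::-1]
    return sum(w * f for w, f in zip(weights, volterra_features(window)))


# Toy data (an assumption for this example): the Henon map,
# x[n] = 1 - 1.4 * x[n-1]**2 + 0.3 * x[n-2], which a second-order
# Volterra model with memory 2 can represent exactly.
series = [0.0, 0.0]
for _ in range(300):
    series.append(1.0 - 1.4 * series[-1] ** 2 + 0.3 * series[-2])

weights = fit_volterra(series[:200], memory=2)
pred = predict_next(weights, series[:250], memory=2)
print(weights)
print(pred, series[250])
```

With memory 2 the weights are ordered as constant, x[n-1], x[n-2], x[n-1]^2, x[n-1]x[n-2], x[n-2]^2. The normal equations keep the sketch dependency-free; for longer memories or noisy audio data, an adaptive or regularized fit would likely be the more robust choice.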

Cite this article

ZHANG Xing-tao, BAO Chang-chun, LIU Xin, ZHANG Li-yan. Audio Bandwidth Extension Based on Volterra Series[J]. Acta Electronica Sinica, 2012, 40(12): 2501-2506.

CLC number: TN912.3
