Acta Electronica Sinica ›› 2013, Vol. 41 ›› Issue (2): 295-300. DOI: 10.3969/j.issn.0372-2112.2013.02.014

• Academic Papers •

Anti-Noise Power Normalized Cepstral Coefficients in Bird Sounds Recognition

YAN Xin, LI Ying

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian 350108, China
  • Received: 2012-05-17 Revised: 2012-09-28 Online: 2013-02-25 Published: 2013-02-25
  • About the authors: YAN Xin, male, born in 1988 in Quanzhou, Fujian Province, is a master's student at the College of Mathematics and Computer Science, Fuzhou University. His main research interest is sound recognition. E-mail: yanxin124@126.com. LI Ying, male, born in 1964 in Minqing County, Fujian Province, is a professor at the College of Mathematics and Computer Science, Fuzhou University. His main research interests are environmental sound recognition and information security. E-mail: fj_liying@fzu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (No. 61075022)

Abstract: To improve the accuracy of bird sound recognition under the varied background noise of real environments, a bird sound recognition technique based on a new anti-noise feature, the anti-noise power normalized cepstral coefficients (APNCC), is proposed. First, a noise estimation algorithm suited to highly non-stationary environments is used to estimate the noise power spectrum. Second, multi-band spectral subtraction is applied to the sound power spectrum to reduce the background noise. Then, the denoised power spectrum is fed into the power normalized cepstral coefficient (PNCC) extraction process to obtain the APNCC. Finally, a support vector machine (SVM) classifier is used with three features, namely APNCC, PNCC, and Mel-frequency cepstral coefficients (MFCC), in comparative recognition experiments on 34 bird sounds in 3 different real environments at various signal-to-noise ratios (SNRs). The experimental results show that APNCC achieves a better average recognition rate and stronger noise robustness than the other features, and is better suited to bird sound recognition when the SNR is below 30 dB.

Key words: bird sounds recognition, non-stationary noise estimation, multi-band spectral subtraction, anti-noise power normalized cepstral coefficients (APNCC), Mel-frequency cepstral coefficients (MFCC)
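
Illustrative sketch: the denoising step named in the abstract, multi-band spectral subtraction, can be outlined as follows. This is not the authors' implementation. The function name `multiband_spectral_subtraction`, the uniform split into `n_bands` bands, the piecewise-linear over-subtraction rule, and the spectral floor `beta` are assumptions chosen for illustration, and the noise power spectrum is taken as given rather than produced by the non-stationary noise estimator the paper uses.

```python
import numpy as np


def multiband_spectral_subtraction(power, noise_power, n_bands=4, beta=0.002):
    """Subtract an estimated noise power spectrum band by band.

    power, noise_power: non-negative arrays of shape (n_frames, n_bins).
    n_bands, beta, and the alpha rule below are illustrative assumptions,
    not values taken from the paper.
    """
    power = np.asarray(power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    n_bins = power.shape[1]
    # Uniformly spaced band edges over the frequency bins (assumption).
    edges = np.linspace(0, n_bins, n_bands + 1).astype(int)
    cleaned = np.empty_like(power)
    eps = 1e-12  # avoids division by zero and log of zero

    for lo, hi in zip(edges[:-1], edges[1:]):
        p = power[:, lo:hi]
        n = noise_power[:, lo:hi]
        # Per-frame SNR of this band, in dB.
        snr_db = 10.0 * np.log10((p.sum(axis=1) + eps) / (n.sum(axis=1) + eps))
        # Over-subtraction factor: 4.75 at or below -5 dB, 1 at or above 20 dB,
        # linear in between (assumed rule).
        alpha = np.clip(4.0 - 0.15 * snr_db, 1.0, 4.75)[:, None]
        # Subtract the scaled noise, then apply a spectral floor so that
        # no bin becomes negative.
        cleaned[:, lo:hi] = np.maximum(p - alpha * n, beta * p)
    return cleaned
```

In a pipeline like the one the abstract describes, the returned power spectrum would replace the noisy one as the input to cepstral feature extraction. Subtracting more aggressively in low-SNR bands is the design choice that distinguishes the multi-band variant from a single global subtraction.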

CLC Number: