Acta Electronica Sinica ›› 2019, Vol. 47 ›› Issue (4): 791-797. DOI: 10.3969/j.issn.0372-2112.2019.04.004

• Research Article •

Improving Generalization Ability of Speech Enhancement Approaches Using Generated Noise

YUAN Wen-hao, LOU Ying-xi, LIANG Chun-yan, XIA Bin

  1. College of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China
  • Received: 2018-02-27 Revised: 2018-07-24 Online: 2019-04-25 Published: 2019-04-25
    • Corresponding author:
    • YUAN Wen-hao
    • About the author:
    • LOU Ying-xi, female, born in 1996 in Liaocheng, Shandong. She graduated from Qindao College, Qingdao University of Technology, in 2018 and is now a master's student at the College of Computer Science and Technology, Shandong University of Technology. Her main research interests are speech signal processing and speech enhancement. E-mail: 1804224373@qq.com
    • Supported by:
    • National Natural Science Foundation of China (No.61701286, No.11704229); Natural Science Foundation of Shandong Province, China (No.ZR2015FL003, No.ZR2017MF047, No.ZR2017LA011)

Abstract: Improving generalization to unknown noise types is a pressing problem for supervised speech enhancement approaches. By modeling a large number of noise types, the deep neural network (DNN) has become an effective way to address this problem. To further improve the generalization ability of DNN-based speech enhancement approaches, this paper designs NoiseGAN, a model based on Generative Adversarial Networks (GAN) that generates new noise types from real noise data. Adding the generated noise types to the training set increases the diversity of its noise types and thereby improves the generalization ability of the speech enhancement model. Speech enhancement experiments with different network structures show that the proposed NoiseGAN can generate new noise types, increase the diversity of noise types in the training set, and effectively improve the generalization ability of speech enhancement models under unknown noise types.
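The training-set augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of NoiseGAN or of the mixing procedure, so the function names `mix_at_snr` and `build_training_pairs`, the SNR values, and the random arrays standing in for clean speech, recorded noise, and generator output are all assumptions made for illustration. The sketch only shows how generated noise clips could be pooled with real ones and mixed into clean speech at a chosen signal-to-noise ratio.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the given SNR in dB."""
    # Tile the noise if it is shorter than the speech, then crop to length.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that brings the noise power to clean_power / 10^(snr_db / 10).
    gain = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise

def build_training_pairs(clean_utts, noise_pool, snrs_db, rng):
    """Pair each clean utterance with a noisy version using a random noise clip and SNR."""
    pairs = []
    for clean in clean_utts:
        noise = noise_pool[rng.integers(len(noise_pool))]
        snr_db = snrs_db[rng.integers(len(snrs_db))]
        pairs.append((mix_at_snr(clean, noise, snr_db), clean))
    return pairs

rng = np.random.default_rng(0)
# Hypothetical stand-ins: random arrays instead of real audio or GAN output.
clean_utts = [rng.standard_normal(16000) for _ in range(4)]
real_noises = [rng.standard_normal(8000) for _ in range(2)]
generated_noises = [rng.standard_normal(8000) for _ in range(2)]

# Augmentation step: the noise pool holds both real and generated noise types.
noise_pool = real_noises + generated_noises
pairs = build_training_pairs(clean_utts, noise_pool, [-5, 0, 5, 10], rng)
```

Each element of `pairs` is a (noisy, clean) tuple of equal-length arrays, the usual input and target for a supervised enhancement model. In a real pipeline the stand-in arrays would be replaced by loaded waveforms and by samples drawn from a trained generator.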

Key words: speech enhancement, generative adversarial networks, generalization ability, deep neural network

CLC Number: