Acta Electronica Sinica ›› 2020, Vol. 48 ›› Issue (7): 1276-1283. DOI: 10.3969/j.issn.0372-2112.2020.07.005

• Research Paper •

A Convolutional Gated Recurrent Network for Speech Enhancement

YUAN Wen-hao, HU Shao-dong, SHI Yun-long, LI Zhao, LIANG Chun-yan

  1. College of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China
  • Received: 2019-08-02 Revised: 2020-03-30 Online: 2020-07-25 Published: 2020-07-25
    • Corresponding author:
    • YUAN Wen-hao
    • About the author:
    • HU Shao-dong, male, born in 1996 in Tai'an, Shandong. He graduated from Shandong University of Technology in 2019 and is currently a master's student at the College of Computer Science and Technology, Shandong University of Technology. His main research interests are speech signal processing and speech enhancement. E-mail: 1764513896@qq.com
    • Supported by:
    • National Natural Science Foundation of China (No.61701286, No.11704229); Natural Science Foundation of Shandong Province (No.ZR2018LF002); Youth Innovation Team Development Program of Shandong Higher Education Institutions (No.2019KJN048)

Abstract: To make fuller use of noisy speech features and thereby improve the performance of speech enhancement networks, this paper exploits the correlation of noisy speech along both the time and frequency dimensions and designs a convolutional gated recurrent network suited to speech enhancement. The network combines the local feature extraction ability of convolutional neural networks with the long-term dependency modeling ability of the gated recurrent unit. It replaces the fully connected structure used in the feature computation of the gated recurrent unit with a convolutional structure, and can therefore better preserve the time-frequency structure of the noisy speech features. Experimental results show that, compared with other speech enhancement networks, the proposed network has a clear advantage in retaining speech components and suppressing noise components, and that the enhanced speech has better quality and intelligibility.

Key words: speech enhancement, deep neural network, gated recurrent unit, convolutional neural network
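
The core idea in the abstract, replacing the fully connected products inside a gated recurrent unit with convolutions so that neighbouring frequency bins stay linked, can be illustrated with a small sketch. This is not the authors' network: it assumes a single feature channel, 1-D convolution along the frequency axis with zero padding, and hypothetical parameter names (`wz`, `uz`, `wr`, `ur`, `wh`, `uh`, `bz`, `br`, `bh`). Kernel sizes, channel counts, and layer layout in the paper may differ.

```python
import math


def conv1d_same(signal, kernel):
    """Zero-padded 1-D cross-correlation; output length equals len(signal)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv_gru_step(x, h, p):
    """One time step of a single-channel convolutional GRU (illustrative only).

    x: features of the current noisy frame, one value per frequency bin
    h: hidden state from the previous frame, same length as x
    p: dict of hypothetical kernels (lists) and scalar biases
    """
    # Update gate z and reset gate r: convolutions replace the usual matrix products.
    z = [sigmoid(a + b + p["bz"])
         for a, b in zip(conv1d_same(x, p["wz"]), conv1d_same(h, p["uz"]))]
    r = [sigmoid(a + b + p["br"])
         for a, b in zip(conv1d_same(x, p["wr"]), conv1d_same(h, p["ur"]))]
    # Candidate state uses the reset-gated previous state.
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b + p["bh"])
            for a, b in zip(conv1d_same(x, p["wh"]), conv1d_same(rh, p["uh"]))]
    # Convention assumed here: z weights the candidate, (1 - z) the old state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run_conv_gru(frames, p):
    """Apply the cell frame by frame, starting from an all-zero hidden state."""
    h = [0.0] * len(frames[0])
    states = []
    for x in frames:
        h = conv_gru_step(x, h, p)
        states.append(h)
    return states
```

Because each gate value at a given frequency bin depends only on a small neighbourhood of bins, the hidden state keeps the same frequency layout as the input, which is one way to read the abstract's claim about preserving time-frequency structure. A fully connected GRU would instead mix all bins in every gate.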
