A Convolutional Gated Recurrent Network for Speech Enhancement

doi:10.3969/j.issn.0372-2112.2020.07.005

Last updated: 2025-07-08

- Acta Electronica Sinica, Vol. 48, Issue 7, Pages 1276-1283 (2020)
- Author affiliation:
  School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China
- DOI: 10.3969/j.issn.0372-2112.2020.07.005
  CLC: TN912.3
- Published online: 25 July 2020
  Published: 2020
A Convolutional Gated Recurrent Network for Speech Enhancement[J]. Acta Electronica Sinica, 2020, 48(7): 1276-1283. DOI: 10.3969/j.issn.0372-2112.2020.07.005.

Abstract

To make fuller use of noisy speech features and thereby improve the performance of speech enhancement networks, this paper designs a convolutional gated recurrent network for speech enhancement. The design exploits the correlation of noisy speech along both the time and frequency dimensions, combining the local feature extraction ability of convolutional neural networks with the long-term dependency modeling ability of gated recurrent units. Within the gated recurrent unit, the network replaces the fully connected structure used for feature computation with a convolutional structure, which better preserves the time-frequency structure of the noisy speech features. Experimental results show that, compared with other speech enhancement networks, the proposed network has clear advantages in retaining speech components and suppressing noise components, and that the enhanced speech has better quality and intelligibility.
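The core idea of the abstract, replacing the fully connected products inside a gated recurrent unit with convolutions, can be illustrated with a minimal sketch. The abstract gives no equations, kernel sizes, or channel counts, so everything below is an assumption: it uses a generic single-channel convolutional GRU formulation with 1-D convolutions along the frequency axis, and the names `conv1d_same`, `conv_gru_step`, `run_sequence`, and the parameter keys are hypothetical. It is not the authors' architecture.

```python
import math


def conv1d_same(x, kernel):
    """Slide `kernel` along the frequency axis with zero padding.

    The output has the same length as `x`, so the frequency layout is kept.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv_gru_step(x, h, p):
    """One time step of a single-channel convolutional GRU cell (assumed form).

    x: current noisy-speech frame, one float per frequency bin
    h: previous hidden state, same length as x
    p: dict with kernels "wz", "uz", "wr", "ur", "wh", "uh"
       and scalar biases "bz", "br", "bh" (hypothetical parameter names)
    """
    # Update gate: convolutions take the place of the usual matrix products.
    z = [sigmoid(a + b + p["bz"])
         for a, b in zip(conv1d_same(x, p["wz"]), conv1d_same(h, p["uz"]))]
    # Reset gate.
    r = [sigmoid(a + b + p["br"])
         for a, b in zip(conv1d_same(x, p["wr"]), conv1d_same(h, p["ur"]))]
    # Candidate state, computed from the input and the reset-gated history.
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b + p["bh"])
            for a, b in zip(conv1d_same(x, p["wh"]), conv1d_same(rh, p["uh"]))]
    # Blend old state and candidate (one common GRU sign convention).
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run_sequence(frames, p):
    """Apply the cell over consecutive frames, starting from a zero state."""
    h = [0.0] * len(frames[0])
    states = []
    for x in frames:
        h = conv_gru_step(x, h, p)
        states.append(h)
    return states
```

Because each gate is computed by a small kernel sliding over neighbouring frequency bins, every hidden value depends only on nearby bins of the input and of the previous state. This is the sense in which a convolutional cell keeps local time-frequency structure that a fully connected cell would mix across all bins. A practical network would use multiple channels and learned kernels in a deep learning framework rather than plain lists.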
