A Deep Convolutional Neural Network Based Speech Enhancement Approach Incorporating Phase Estimation

YUAN Wen-hao; LIANG Chun-yan; XIA Bin; SUN Wen-zhu

doi:10.3969/j.issn.0372-2112.2018.10.008


Updated: 2025-07-16

- Acta Electronica Sinica Vol. 46, Issue 10, Pages: 2359-2366(2018)
- Author affiliation:
  School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000
- DOI: 10.3969/j.issn.0372-2112.2018.10.008
  CLC: TN912.3
- Published: 2018
YUAN Wen-hao, LIANG Chun-yan, XIA Bin, et al. A Deep Convolutional Neural Network Based Speech Enhancement Approach Incorporating Phase Estimation[J]. Acta Electronica Sinica, 2018, 46(10): 2359-2366.

Abstract

In time-frequency-domain speech enhancement, amplitude estimation and phase estimation are both important factors affecting enhancement performance. To incorporate phase estimation into deep-learning-based speech enhancement, this paper treats the real and imaginary parts of the short-time Fourier transform (STFT) of noisy speech as two input channels to a deep convolutional neural network (DCNN). By building a multi-task learning model that simultaneously estimates the real and imaginary parts of the clean-speech STFT, the approach estimates amplitude and phase jointly. Experimental results show that, compared with approaches that consider only amplitude estimation, the proposed approach suppresses noise better and significantly improves speech enhancement performance at low SNR.
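The two-channel representation described in the abstract can be illustrated with a minimal NumPy sketch. It only shows how the real and imaginary parts of an STFT could be stacked as network input and target, and how amplitude and phase follow from a real/imaginary estimate. The frame length, hop size, Hann window, sampling rate, and synthetic signals are all assumptions made for illustration; the abstract does not specify them, and the DCNN itself is not shown.

```python
import numpy as np

def stft(signal, frame_len=256, hop=128):
    """Hann-windowed STFT; returns a complex array of shape (frames, bins).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def to_two_channels(spec):
    """Stack real and imaginary parts as two channels: (2, frames, bins)."""
    return np.stack([spec.real, spec.imag], axis=0)

def amplitude_and_phase(channels):
    """Recover amplitude and phase from a (2, frames, bins) real/imag array."""
    real, imag = channels
    return np.hypot(real, imag), np.arctan2(imag, real)

# Synthetic stand-ins for clean and noisy speech (assumed 8 kHz sampling rate).
rng = np.random.default_rng(0)
t = np.arange(4000) / 8000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noisy = clean + 0.5 * rng.standard_normal(t.shape)

network_input = to_two_channels(stft(noisy))    # two-channel noisy features
training_target = to_two_channels(stft(clean))  # two-channel clean targets

# An estimate of both channels determines amplitude and phase together.
amplitude, phase = amplitude_and_phase(training_target)
```

Because amplitude is the magnitude of the complex value and phase is its angle, a model that outputs both channels estimates the two quantities at once, which is the sense in which the abstract calls the estimation synchronous.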
