Key Program of Science and Technology Development Plan of Beijing Municipality Education Commission (No. KZ201110005005); National Natural Science Foundation of China (No. 61072089); Talents training program of higher education institutions in Beijing Municipality
A Gaussian mixture model (GMM) based speech enhancement method operating in the compressed domain is proposed for the ITU-T G.722.2 wideband speech codec. It takes full advantage of prior knowledge of the immittance spectral frequencies (ISFs) of clean speech. First, a GMM is adopted to model the joint probability density of feature vectors composed of the ISFs of noisy speech and clean speech together with the corresponding gain scaling factor. Second, an optimal Bayesian estimate of the clean-speech feature parameters is obtained under the minimum mean square error (MMSE) criterion. To remain compatible with the discontinuous transmission (DTX) mode, the logarithmic energy is attenuated and the ISFs are left unchanged when a silence insertion descriptor (SID) frame is received. Furthermore, if an erased frame is received, the bit stream is left unchanged and the proposed method is applied to the recovered parameters to update the memory. The evaluation is conducted according to ITU-T G.160. The results indicate that, compared with the reference method, the proposed method achieves a larger noise level reduction with better objective speech quality.
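
The MMSE estimation step can be illustrated with a minimal sketch. It assumes a joint GMM over the stacked vector z = [x; y], where x holds the noisy-speech features and y the clean-speech features, and computes the conditional mean E[y | x] as a posterior-weighted sum of per-component linear estimates. The function name `gmm_mmse_estimate` and its parameters are hypothetical and chosen for illustration only; this is not the authors' implementation, and the feature layout used in the paper (ISFs plus gain scaling factor) is not reproduced here.

```python
import numpy as np

def gmm_mmse_estimate(x, weights, means, covs):
    """Return E[y | x] under a joint GMM over z = [x, y].

    x       : observed (noisy) feature vector, shape (d,)
    weights : mixture weights, length K
    means   : K joint mean vectors, each of shape (d + dy,)
    covs    : K joint covariance matrices, each (d + dy, d + dy)
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[0]
    log_post = np.empty(len(weights))
    cond_means = []
    for k, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        mu = np.asarray(mu, dtype=float)
        cov = np.asarray(cov, dtype=float)
        mu_x, mu_y = mu[:d], mu[d:]
        s_xx, s_yx = cov[:d, :d], cov[d:, :d]
        diff = x - mu_x
        sol = np.linalg.solve(s_xx, diff)  # S_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(s_xx)
        # Log of w_k * N(x; mu_x, S_xx), i.e. the unnormalised posterior.
        log_post[k] = np.log(w) - 0.5 * (diff @ sol + logdet + d * np.log(2 * np.pi))
        # Per-component conditional mean of y given x.
        cond_means.append(mu_y + s_yx @ sol)
    # Normalise posteriors in a numerically stable way.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return post @ np.array(cond_means)
```

With a single component, the estimate reduces to the familiar linear (Wiener-like) estimator; with several components, each component contributes its own linear estimate in proportion to how well it explains the observed x.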