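1. School of Biological Science and Medical Engineering, Beihang University, Beijing 100191
2. Beijing Institute of Mechanical Equipment, Beijing 100854
4. School of Information, Renmin University of China, Beijing 100872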
Published online: 2020-05-25
Published in print: 2020
QIAN Zhao-peng, XIAO Ke-jing, LIU Chan, et al. Voice Conversion for Enhancing Mandarin Electro-Laryngeal Speech Based on Semantic Information[J]. Acta Electronica Sinica, 2020, 48(5): 840-845. DOI: 10.3969/j.issn.0372-2112.2020.05.002.
Electro-laryngeal (EL) speech suffers from a monotone fundamental frequency, a mechanical sound quality, and strong radiation noise, all of which severely reduce its intelligibility and naturalness. The problem is especially serious for tonal languages such as Mandarin. Recognition of Mandarin EL speech confuses consonants and yields output without tones. This paper therefore designs a pinyin spelling corrector and a tone labelling tool that operate on the recognition result, and combines them with a Tacotron-2-based TTS system to convert EL speech into normal speech. Objective evaluation shows that the spelling corrector improves pinyin accuracy and that tone labelling achieves high accuracy when semantic context is available. Subjective listening tests show that the proposed method improves the intelligibility and naturalness of Mandarin EL speech at different linguistic levels. The results indicate that the proposed method can convert toneless EL speech into normal speech and that it outperforms traditional voice conversion methods based on speech signal processing.
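The correction and tone labelling steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the confusion sets, the toy lexicon, and the function names (`split_syllable`, `variants`, `correct_and_label`) are all hypothetical, chosen only to show how a lexicon lookup might repair a confused initial consonant and attach tones to toneless pinyin. The recognition front end and the Tacotron-2 synthesis stage are not shown.

```python
from itertools import product

# Hypothetical confusion sets for initial consonants (illustrative only;
# the abstract does not list which consonants are confused).
CONFUSABLE = {"p": {"b"}, "b": {"p"}, "t": {"d"}, "d": {"t"}, "k": {"g"}, "g": {"k"}}

# Pinyin initials, two-letter initials first so that "zh" wins over "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")

# Toy lexicon (assumption): toneless pinyin word -> pinyin with tone numbers.
LEXICON = {
    ("da", "jia"): ("da4", "jia1"),
    ("bei", "jing"): ("bei3", "jing1"),
}


def split_syllable(syllable):
    """Split a toneless pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # zero-initial syllable such as "an"


def variants(syllable):
    """Return the syllable itself, then versions with a confusable initial swapped in."""
    initial, final = split_syllable(syllable)
    return [syllable] + sorted(alt + final for alt in CONFUSABLE.get(initial, ()))


def correct_and_label(word):
    """Find a lexicon word reachable by swapping confusable initials.

    Returns (corrected_syllables, toned_syllables); the tones are None
    when no lexicon entry matches and the input is returned unchanged.
    """
    for candidate in product(*(variants(s) for s in word)):
        if candidate in LEXICON:
            return candidate, LEXICON[candidate]
    return tuple(word), None


print(correct_and_label(["ta", "jia"]))
```

Here the recognized sequence `ta jia` is not in the toy lexicon, so the sketch tries the confusable initial `d` and recovers `da jia` with tones from the lexicon entry. A realistic system would need a full lexicon and wider sentence context to choose between competing candidates.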