Acta Electronica Sinica ›› 2020, Vol. 48 ›› Issue (5): 840-845. DOI: 10.3969/j.issn.0372-2112.2020.05.002

• Research Article •

Voice Conversion for Enhancing Mandarin Electro-Laryngeal Speech Based on Semantic Information

QIAN Zhao-peng1, XIAO Ke-jing2, LIU Chan3, SUN Yue1

  1. School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China;
  2. Information School, Renmin University of China, Beijing 100872, China;
  3. Beijing Mechanical Equipment Institute, Beijing 100854, China
  • Received: 2019-08-05 Revised: 2019-10-08 Online: 2020-05-25 Published: 2020-05-25
  • Corresponding author: QIAN Zhao-peng
  • About the author: XIAO Ke-jing, female, born in 1991 in Xinyang, Henan. She is currently a Ph.D. candidate at Renmin University of China. Her main research interests are natural language processing, semantic analysis, and text mining. E-mail: xiaokejing0501@163.com
  • Funding: Beijing Natural Science Foundation (No.4194079); Open Project of the State Key Laboratory of Virtual Reality, Beihang University (No.VRLAB2018B06); Open Project of the National Engineering Laboratory for Agri-product Quality and Safety Traceability Technology and Application, Beijing Technology and Business University (No.AQT-2018-YB4)

Abstract: Electro-laryngeal (EL) speech suffers from several defects, including a monotone fundamental frequency, a mechanical sound quality, and strong radiation noise. These defects severely degrade its intelligibility and naturalness, and the problem is especially serious for tonal languages such as Mandarin. Recognition of Mandarin EL speech confuses consonants, and its output carries no tone. Building on the recognition output, this paper therefore designs a pinyin spelling corrector and a tone labelling tool, and combines them with a text-to-speech (TTS) system based on Tacotron-2 to convert EL speech into normal speech. Objective evaluation shows that the pinyin spelling corrector improves pinyin accuracy and that tone labelling achieves high accuracy when semantic context is available. Subjective listening tests show that the proposed method improves the intelligibility and naturalness of Mandarin EL speech at different linguistic levels. The results indicate that the proposed method can convert toneless EL speech into normal speech and that it outperforms traditional voice conversion methods based on speech signal processing.

Key words: electro-laryngeal speech, pinyin spelling corrector, pinyin tone labelling, voice conversion
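
As a rough illustration of the two text-side stages named in the abstract (pinyin spelling correction followed by tone labelling), the toy Python sketch below may help. It is not the authors' implementation: the confusion sets, the three-entry lexicon, and the function names `correct_word`, `label_tones`, and `convert` are all assumptions made for this example, and a plain dictionary lookup stands in for the semantic model the paper relies on. The TTS stage is left out.

```python
from itertools import product

# ASSUMPTION: initials a recognizer might confuse on EL speech (illustrative only).
CONFUSABLE_INITIALS = [{"b", "p"}, {"d", "t"}, {"g", "k"},
                       {"z", "c"}, {"zh", "ch"}, {"j", "q"}]

# ASSUMPTION: toy lexicon from toneless pinyin words to tone-numbered pinyin.
LEXICON = {
    ("ni", "hao"): ("ni3", "hao3"),
    ("dian", "zi"): ("dian4", "zi3"),
    ("pu", "tong", "hua"): ("pu3", "tong1", "hua4"),
}

# Two-letter initials come first so that "zh" is matched before "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_syllable(syllable):
    """Split a toneless pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # zero-initial syllable such as "an"


def variants(syllable):
    """Return the syllable plus versions with a confusable initial swapped in."""
    initial, final = split_syllable(syllable)
    out = {syllable}
    for group in CONFUSABLE_INITIALS:
        if initial in group:
            out |= {alt + final for alt in group}
    return out


def correct_word(syllables):
    """Pick a lexicon word reachable by swapping confusable initials."""
    key = tuple(syllables)
    if key in LEXICON:
        return key
    for candidate in product(*(sorted(variants(s)) for s in syllables)):
        if candidate in LEXICON:
            return candidate
    return key  # no correction found: leave the input unchanged


def label_tones(syllables):
    """Attach tone numbers by lexicon lookup; unknown words stay toneless."""
    return LEXICON.get(tuple(syllables), tuple(syllables))


def convert(recognized):
    """Spelling correction followed by tone labelling."""
    return label_tones(correct_word(recognized))


print(convert(["bu", "dong", "hua"]))
```

Here a recognizer output of `bu dong hua` is mapped to the lexicon word `pu tong hua` and then given tone numbers; the tone-numbered pinyin is the kind of text input a TTS front end could consume. A realistic system would score candidates with a language model over context rather than accept the first lexicon match.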
