Acta Electronica Sinica (电子学报) ›› 2001, Vol. 29 ›› Issue (4): 456-460.

Codebook-Based Speaker Adaptation (基于码本的说话人自适应方法)

LU Jin (吕津), ZHAO Ming-sheng (赵明生), WANG Zuo-ying (王作英)

  1. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
  • Received: 1999-09-10 Revised: 2000-12-04 Online: 2001-04-25 Published: 2001-04-25

Abstract: This paper presents codebook-based speaker adaptation, a method that combines the advantages of the two main families of speaker adaptation methods: transformation-based methods and Bayesian estimation. The method adapts quickly when only a small amount of adaptation data is available, and its performance asymptotically approaches matched-condition performance as the amount of adaptation data grows. Adaptation proceeds in two stages. In the first stage, the target speaker's codebook is approximated by a linear combination of the codebooks of a large number of reference speakers. Only a small amount of adaptation data is needed at this stage to compute the optimal weight of each codebook, using an optimization algorithm based on the Rosen gradient projection method. In the second stage, the optimal linear combination serves as the prior estimate of the target speaker's codebook. As more adaptation data are gathered, Bayesian estimation refines the target speaker's codebook further, so that incremental speaker adaptation is achieved. The method is applied to a speaker-independent continuous speech recognition system for Chinese. A series of comparative experiments indicates that the method is promising.

Key words: speech recognition, codebook-based speaker adaptation, Rosen gradient projection method
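
As a rough illustration of the two stages described in the abstract, the sketch below first fits combination weights for a set of reference codebooks and then applies a Bayesian update as frames arrive. The abstract states neither the objective function nor the constraints, so everything specific here is an assumption made for illustration and not the authors' formulation: the squared-error objective, the constraints (non-negative weights that sum to one), the simplified Rosen-style active-set projection, the conjugate-Gaussian MAP update of a mean, and all names (`combine`, `fit_weights`, `map_update`, `tau`).

```python
# Illustrative sketch only: objective, constraints, and names are assumptions,
# not the formulation used in the paper.

def combine(weights, codebooks):
    """Weighted sum of reference codebooks, each flattened to a list of floats."""
    dim = len(codebooks[0])
    return [sum(w * cb[k] for w, cb in zip(weights, codebooks)) for k in range(dim)]


def fit_weights(codebooks, target, max_iter=1000, tol=1e-9):
    """Minimise ||sum_r w_r * codebook_r - target||^2 with w_r >= 0, sum(w) = 1.

    Simplified Rosen-style gradient projection: the gradient is projected onto
    the active constraints, and a bound is released only when its multiplier
    estimate is negative.
    """
    n = len(codebooks)
    w = [1.0 / n] * n          # feasible starting point
    active = set()             # indices currently held at w_i = 0
    for _ in range(max_iter):
        resid = [c - t for c, t in zip(combine(w, codebooks), target)]
        grad = [2.0 * sum(r * x for r, x in zip(resid, cb)) for cb in codebooks]
        free = [i for i in range(n) if i not in active]
        mean_g = sum(grad[i] for i in free) / len(free)
        # Projection of -grad onto {sum(d) = 0, d_i = 0 for active i}.
        d = [-(grad[i] - mean_g) if i not in active else 0.0 for i in range(n)]
        if sum(x * x for x in d) ** 0.5 < tol:
            # Stationary on the current face: inspect multipliers of active bounds.
            worst = min(active, key=lambda i: grad[i] - mean_g, default=None)
            if worst is None or grad[worst] - mean_g >= -tol:
                break
            active.remove(worst)
            continue
        # Exact line search for the quadratic objective, capped to stay feasible.
        cd = combine(d, codebooks)
        curvature = 2.0 * sum(x * x for x in cd)
        if curvature <= 0.0:
            break
        slope = sum(g * x for g, x in zip(grad, d))
        step_max = min((w[i] / -d[i] for i in free if d[i] < 0.0),
                       default=float("inf"))
        step = min(-slope / curvature, step_max)
        w = [wi + step * di for wi, di in zip(w, d)]
        for i in free:
            if w[i] <= 1e-12:  # a bound became active
                w[i] = 0.0
                active.add(i)
    return w


def map_update(prior_mean, frames, tau=10.0):
    """MAP estimate of a mean vector: (tau * prior + sum of frames) / (tau + n)."""
    n = len(frames)
    return [(tau * p + sum(f[k] for f in frames)) / (tau + n)
            for k, p in enumerate(prior_mean)]


# Toy usage: three reference "codebooks" and a rough target estimate.
refs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = fit_weights(refs, [0.25, 0.5])          # stage 1: combination weights
prior = combine(weights, refs)                    # prior estimate of the codebook
adapted = map_update(prior, [[0.3, 0.6], [0.3, 0.6]], tau=2.0)  # stage 2
```

In this sketch `tau` controls how strongly the stage-one prior is trusted: with few frames the estimate stays close to the combined codebook, and as more frames arrive it moves toward their sample mean, which mirrors the incremental behaviour the abstract describes.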
