Codebook-Based Speaker Adaptation

LU Jin; ZHAO Ming-sheng; WANG Zuo-ying


Paper | Updated: 2025-07-16

- Codebook-Based Speaker Adaptation
- Acta Electronica Sinica, 2001, Vol. 29, No. 4, pp. 456-460
- Author affiliation: Department of Electronic Engineering, Tsinghua University, Beijing 100084
- CLC number: TN912.34
- Print publication: 2001
LU Jin, ZHAO Ming-sheng, WANG Zuo-ying. Codebook-Based Speaker Adaptation[J]. Acta Electronica Sinica, 2001, 29(4): 456-460.

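Abstract (translated from the Chinese)

This paper proposes a codebook-based speaker adaptation method. It combines the strengths of the two main families of speaker adaptation methods, transformation-based methods and Bayesian estimation, so that it adapts quickly to a new speaker while retaining good asymptotic consistency. The adaptation process has two stages. In the first stage, the user's speech codebook is approximated by a linear combination of the speech codebooks of a large number of reference speakers. At this point, only a small amount of adaptation training data is needed to compute the optimal weight of each codebook in the linear combination, using an optimization algorithm based on Rosen's gradient projection method. In the second stage, the optimal linear combination of codebooks serves as the prior estimate of the user's codebook. As more adaptation training data become available, the system refines the user's codebook by Bayesian estimation, which yields incremental adaptation. The authors applied the method to a speaker-independent continuous Chinese speech recognition system. A series of comparative experiments indicates that the adaptation method is promising.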
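The two stages can be written compactly. The formulation below is a sketch under assumptions, not the paper's own equations: the squared-error objective, the simplex constraint on the weights, and the prior-strength parameter $\tau$ are all assumed here for illustration. Here $\mu_k^{(r)}$ is codeword $k$ of reference speaker $r$, $x_{k,t}$ are the $n_k$ adaptation frames assigned to codeword $k$, and $\bar{x}_k$ is their sample mean.

$$
\hat{\mu}_k(w) = \sum_{r=1}^{R} w_r\,\mu_k^{(r)}, \qquad
w^{*} = \arg\min_{w} \sum_{k} n_k \bigl\lVert \bar{x}_k - \hat{\mu}_k(w) \bigr\rVert^{2}
\quad \text{s.t. } w_r \ge 0,\ \sum_{r=1}^{R} w_r = 1
$$

$$
\mu_k^{\mathrm{MAP}} = \frac{\tau\,\hat{\mu}_k(w^{*}) + \sum_{t=1}^{n_k} x_{k,t}}{\tau + n_k}
$$

Under these assumptions the first stage has only $R$ free parameters, which is why a small amount of data can suffice, and its constraints are linear, which is the setting gradient projection methods address. In the second stage, $\mu_k^{\mathrm{MAP}}$ equals the stage-one estimate when $n_k = 0$ and tends to the sample mean $\bar{x}_k$ as $n_k$ grows, which is the asymptotic behavior the abstract describes.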

Abstract

This paper presents a new speaker adaptation method, codebook-based speaker adaptation, which combines the advantages of transformation-based methods with those of Bayesian adaptive learning. The adapted system improves its performance with only a small amount of adaptation data, and it approaches matched-condition performance asymptotically as the amount of adaptation data increases. The adaptation process is divided into two stages. In the first stage, the acoustic parameters of a target speaker are approximated by a linear combination of the codebooks of many reference speakers. An effective algorithm based on Rosen's gradient projection method is developed to compute the weight of each codebook in the linear combination. In the second stage, the combination of codebooks is used as the prior estimate of the target speaker's codebook, and Bayesian adaptive learning refines that estimate as more adaptation data are gathered. Incremental speaker adaptation is thus achieved. As an illustration, the method is applied to a speaker-independent continuous speech recognition system for Chinese. A series of comparative experiments was conducted to evaluate the proposed method, and the results show that it is quite promising.
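The two-stage procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stage-one optimizer here is plain projected gradient descent with a Euclidean projection onto the probability simplex, swapped in for Rosen's gradient projection method, and the squared-error objective, the simplex constraint, the flattened codebook layout, and every name (`project_to_simplex`, `fit_weights`, `map_update`, `tau`, `lr`) are assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def project_to_simplex(v: Vector) -> Vector:
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold from the last valid k
    return [max(v_i - theta, 0.0) for v_i in v]


def combine(weights: Vector, codebooks: List[Vector]) -> Vector:
    """Weighted sum of reference codebooks (each flattened to one vector)."""
    dim = len(codebooks[0])
    return [sum(w * cb[d] for w, cb in zip(weights, codebooks))
            for d in range(dim)]


def fit_weights(codebooks: List[Vector], target: Vector,
                steps: int = 500, lr: float = 0.05) -> Vector:
    """Stage 1 (assumed objective): minimize ||sum_r w_r c_r - target||^2
    over the simplex by projected gradient descent."""
    n = len(codebooks)
    w = [1.0 / n] * n  # start from the uniform combination
    for _ in range(steps):
        residual = [e - t for e, t in zip(combine(w, codebooks), target)]
        grad = [2.0 * sum(r * c for r, c in zip(residual, cb))
                for cb in codebooks]
        w = project_to_simplex([w_i - lr * g for w_i, g in zip(w, grad)])
    return w


def map_update(prior_mean: Vector, frames: List[Vector],
               tau: float = 10.0) -> Vector:
    """Stage 2 (assumed form): MAP estimate of one codeword mean, with the
    stage-one estimate as prior mean and tau as the prior strength."""
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(f[d] for f in frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


if __name__ == "__main__":
    # Toy data: three reference "codebooks" and a target inside their hull.
    refs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    weights = fit_weights(refs, [0.3, 0.5])
    prior = combine(weights, refs)
    print(weights, map_update(prior, [[0.4, 0.6]] * 5))
```

With few frames the MAP estimate stays close to the stage-one combination; as frames accumulate, the data term dominates and the estimate moves toward the sample mean, which mirrors the incremental behavior described in the abstract.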
