Improvement of Hidden Markov Model for Speech Recognition

战普明; 王作英; 陆大


- Journal: Acta Electronica Sinica (电子学报), 1994, No. 1
- Author affiliation: Department of Electronic Engineering, Tsinghua University
- CLC number (中图分类号): TN912.34
- Print publication: 1994
[1] 战普明, 王作英, 陆大. Improvement of Hidden Markov Model for Speech Recognition[J]. Acta Electronica Sinica, 1994(01): 9-15.

Abstract

The Hidden Markov Model (HMM) widely used in speech recognition is a first-order Markov model, so it cannot fully describe the temporal dependence of the speech signal. In theory the HMM can be extended to a higher-order Markov model, but the required computation and memory grow exponentially, which makes this impractical. This paper therefore proposes a new model that combines the HMM with a multivariate Gaussian density function capable of describing the temporal dependence of the speech signal, and argues theoretically that the new model is reasonable. Recognition experiments on all 409 Chinese monosyllables, with tone disregarded, show that the recognition rate of the new model is significantly and consistently higher than that of the HMM. In addition, a smoothed statistical histogram is used to describe state duration, because the experiments showed that continuous density functions such as the Gaussian and Gamma densities cannot satisfactorily describe the state duration of either the HMM or the new model.
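The idea of modelling state duration with a smoothed histogram rather than a parametric density can be illustrated with a minimal sketch. The abstract does not say how the histogram was smoothed, so the three-point kernel, the probability floor, and the function name `smoothed_duration_pmf` below are assumptions chosen for illustration only, not the authors' procedure.

```python
import numpy as np


def smoothed_duration_pmf(durations, max_duration,
                          kernel=(0.25, 0.5, 0.25), floor=1e-6):
    """Estimate P(d), d = 1..max_duration, from observed state durations.

    durations: state occupancy lengths in frames, e.g. taken from a forced
    alignment. The kernel and floor are illustrative assumptions.
    """
    # Raw histogram: counts[d - 1] is the number of times duration d occurred.
    counts = np.zeros(max_duration, dtype=float)
    for d in durations:
        if 1 <= d <= max_duration:
            counts[d - 1] += 1.0

    # Spread each count over neighbouring durations so that unseen but
    # plausible durations receive some probability mass.
    smoothed = np.convolve(counts, np.asarray(kernel, dtype=float), mode="same")

    # Small floor so that no duration has exactly zero probability.
    smoothed += floor
    return smoothed / smoothed.sum()


# Hypothetical durations (in frames) observed for one state.
durations = [3, 4, 4, 5, 5, 5, 6, 9]
pmf = smoothed_duration_pmf(durations, max_duration=12)
```

Because the estimate is a table over discrete durations, it can follow skewed or multi-peaked shapes that a single Gaussian or Gamma density would not capture, at the cost of needing enough training samples per state.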

