National Key Research and Development Program of China (No. 2017YFB1402102); National Natural Science Foundation of China (No. 11872036, No. 11502133, No. 11772178); Key Research and Development Program of Shaanxi Province (No. 2019GY-217, No. 2019ZDLSF07-01)
WU Xia, WU Xiao-jun, SHI Su-zhen, et al. Research on DUPSO-RPSOVF Speech Prediction Model with Hidden Phase Space[J]. Acta Electronica Sinica, 2019, 47(9): 1875-1882. DOI: 10.3969/j.issn.0372-2112.2019.09.009.
Research on DUPSO-RPSOVF Speech Prediction Model with Hidden Phase Space
Abstract
A nonlinear prediction model for speech signals based on the second-order Volterra series is proposed. To overcome the inherent shortcomings of the traditional least mean square (LMS) algorithm in updating the model kernel coefficients, a dissipative uniform particle swarm optimization (DUPSO) algorithm is introduced to solve for the kernel coefficients, yielding the DUPSO-SOVF prediction model. To avoid the phase space reconstruction step of traditional methods, a DUPSO-SOVF prediction model with hidden phase space is constructed, in which the optimal embedding dimension and delay time are obtained dynamically while the kernel coefficients are being solved. To reduce model complexity, the key model terms are extracted within the allowable error, which reduces the number of kernel coefficients and yields the reduced-parameter DUPSO-RPSOVF (Reduced Parameter SOVF, RPSOVF) prediction model. Simulations on English phonemes, words, and phrases show that the DUPSO-SOVF model with hidden phase space accurately calculates the phase space reconstruction parameters (embedding dimension and delay time), and that both the DUPSO-SOVF and DUPSO-RPSOVF models achieve higher prediction accuracy on single-frame and multi-frame speech signals than the PSO-SOVF and LMS-SOVF models. The two proposed models also reflect the trends and regularities of the speech series well and meet the requirements of speech series prediction.
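The abstract does not state the model equation, so the following is only a minimal sketch of the textbook second-order Volterra predictor evaluated on a delay vector, and it may differ from the paper's exact formulation. The function names (`delay_vector`, `sovf_terms`, `sovf_predict`, `num_kernels`) and the ordering of the kernel coefficients are assumptions made for illustration. The kernel values are taken as given here; in the paper they are obtained by DUPSO, which is not reproduced.

```python
def delay_vector(x, n, m, tau):
    """Return [x(n), x(n - tau), ..., x(n - (m - 1) * tau)].

    m is the embedding dimension and tau the delay time (both assumed
    to be known here; the paper searches for them during optimization).
    """
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    if n - (m - 1) * tau < 0:
        raise ValueError("not enough history for this (m, tau)")
    return [x[n - i * tau] for i in range(m)]


def sovf_terms(v):
    """Constant, linear, and quadratic (i <= j) terms of a delay vector."""
    terms = [1.0]
    terms.extend(v)
    m = len(v)
    for i in range(m):
        for j in range(i, m):
            terms.append(v[i] * v[j])
    return terms


def num_kernels(m):
    """Number of coefficients: 1 constant + m linear + m(m+1)/2 quadratic."""
    return 1 + m + m * (m + 1) // 2


def sovf_predict(x, n, m, tau, kernels):
    """One-step prediction of x(n + 1) as a weighted sum of the terms.

    kernels is a flat list in the same order as sovf_terms (an assumed
    layout, not the paper's).
    """
    terms = sovf_terms(delay_vector(x, n, m, tau))
    if len(kernels) != len(terms):
        raise ValueError("kernel count does not match embedding dimension")
    return sum(h * t for h, t in zip(kernels, terms))
```

The coefficient count grows quadratically with the embedding dimension, which is the complexity the reduced-parameter variant targets: dropping a term here would amount to fixing the corresponding entry of `kernels` to zero.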