YANG Xu-kui, QU Dan, ZHANG Wen-lin, et al. Adaptive Voice Activity Detection Based on Long-Term Information[J]. Acta Electronica Sinica, 2018, 46(4): 878-885. DOI: 10.3969/j.issn.0372-2112.2018.04.016.
Long-term information of speech signals performs well in voice activity detection (VAD). Six types of long-term information based on auditory filter banks are proposed, obtained through non-linear spectral decomposition with three different auditory filters. Further, an adaptive voice activity detection algorithm based on these types of long-term information is proposed. Without additional training data, this algorithm uses data selected from the test signals according to the long-term information to train a speech/non-speech classifier, and then classifies the current test signals frame by frame with that classifier. Experiments on the TIMIT and NOISEX-92 datasets show that the algorithm improves VAD performance, with higher accuracy and stronger robustness in low-SNR noisy environments. Online experiments show that it also performs well under real-time processing conditions.
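
The self-training idea in the abstract (select confident frames from the test signal itself, train a speech/non-speech classifier on them, then label every frame) might be sketched as follows. This is only an illustration under strong assumptions, not the authors' method: the long-term feature here is a moving average of frame log energy rather than the auditory filter bank features of the paper, the classifier is a two-class Gaussian model with a variance floor, and all function names and parameters (`frame_log_energy`, `long_term_average`, `adaptive_vad`, `select_fraction`, `context`) are hypothetical.

```python
import numpy as np


def frame_log_energy(signal, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames and return per-frame log energy."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)


def long_term_average(feature, context=5):
    """Average a per-frame feature over 2*context+1 neighbouring frames.

    Assumption: this stands in for the paper's long-term information.
    """
    width = 2 * context + 1
    kernel = np.ones(width) / width
    padded = np.pad(feature, context, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def adaptive_vad(signal, select_fraction=0.2):
    """Return one boolean speech/non-speech decision per frame."""
    energy = frame_log_energy(signal)
    long_term = long_term_average(energy)

    # Step 1: pick the most confident frames from the test signal itself,
    # using the long-term feature (lowest values as noise, highest as speech).
    order = np.argsort(long_term)
    k = max(1, int(select_fraction * len(order)))
    noise_idx, speech_idx = order[:k], order[-k:]

    # Step 2: train a minimal Gaussian classifier on the selected frames.
    # The 0.1 floor on the standard deviation is an arbitrary illustrative choice.
    mu_n = energy[noise_idx].mean()
    sd_n = max(energy[noise_idx].std(), 0.1)
    mu_s = energy[speech_idx].mean()
    sd_s = max(energy[speech_idx].std(), 0.1)

    # Step 3: classify every frame by comparing Gaussian log-likelihoods.
    ll_n = -0.5 * ((energy - mu_n) / sd_n) ** 2 - np.log(sd_n)
    ll_s = -0.5 * ((energy - mu_s) / sd_s) ** 2 - np.log(sd_s)
    return ll_s > ll_n
```

Calling `adaptive_vad(signal)` on a one-dimensional NumPy array returns a boolean array with one entry per frame. Because the classifier is re-estimated from each test signal, no separate training set is needed, which is the property the abstract emphasises; a realistic system would replace the energy feature and the Gaussian model with stronger components.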