
The Research Of Voice Activity Detection Based On Long-term Features

Posted on: 2015-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: L Feng
Full Text: PDF
GTID: 2428330488499486
Subject: Computer technology
Abstract/Summary:
Accurate Voice Activity Detection (VAD) improves the accuracy and efficiency of downstream speech processing and provides the basis for segmenting speech. Many VAD methods already exist. They perform well at high signal-to-noise ratio (SNR) and in stationary noise, but their detection performance falls drastically at low SNR and in non-stationary noise. This thesis studies VAD under those conditions. Long-term features are obtained by sliding a long-term window over speech that has already been split into frames and then analyzing the characteristics of each resulting multi-frame segment. Such features can improve VAD performance at low SNR and in non-stationary noise. Long-Term Signal Variability (LTSV) is a long-term feature based on spectral entropy. Research shows that LTSV is more robust than short-term features and other long-term features. In this thesis, we improve on LTSV and propose two new VAD methods based on long-term features:

(1) Spectral flatness effectively characterizes the distribution of the power spectrum, and it differs significantly between speech and noise. Using the principle of spectral flatness, we propose a VAD method based on the long-term flatness of LTSV. First, the framed speech is re-segmented with a long-term window. Next, the distribution of the signal over that long-term window is analyzed for each frequency. Finally, the feature is the variance, across all frequencies of the frame, of the long-term flatness. The VAD decision uses an adaptive threshold together with a voting mechanism. The experimental results indicate that the long-term flatness of LTSV has more discriminative power than LTSV in cutting noise, impulsive noise, and speech-like noise.

(2) Compared with static features, dynamic features fit speech better because they capture how the signal changes over time. Moreover, long-term dynamic characteristics extract more contextual information than short-term dynamic characteristics. We propose a new VAD method that combines the long-term dynamic characteristics of LTSV with the same classifier, an adaptive threshold followed by a voting mechanism. The experimental results indicate that, compared with both LTSV and the long-term flatness of LTSV, the long-term dynamic characteristics of LTSV have more discriminative power and make VAD more robust at low SNR and in non-stationary noise.
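The long-term feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and it rests on several assumptions. The input is a precomputed power spectrogram (a list of frames, each a list of per-bin powers). LTSV is taken to be the variance across frequency bins of the entropy of each bin's power over the last R frames. The "long-term flatness" variant is read here as replacing that entropy with the ratio of geometric to arithmetic mean, which is one interpretation of the abstract. All function names, the fixed threshold, and the 0.8 voting ratio are hypothetical. The abstract does not describe the adaptive threshold rule, so a constant is used in its place.

```python
import math

EPS = 1e-12  # floor that avoids log(0) and division by zero


def entropy(column):
    """Entropy of one bin's power, normalized over the long-term window."""
    total = sum(column) + EPS
    return -sum((s / total) * math.log(s / total) for s in column if s > 0)


def flatness(column):
    """Geometric mean over arithmetic mean of one bin's power over the window."""
    log_mean = sum(math.log(max(s, EPS)) for s in column) / len(column)
    return math.exp(log_mean) / (sum(column) / len(column) + EPS)


def long_term_feature(power, m, R, per_bin=entropy):
    """Variance across bins of a per-bin statistic over frames m-R+1 .. m."""
    window = power[m - R + 1 : m + 1]
    values = [per_bin([frame[k] for frame in window])
              for k in range(len(window[0]))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def vote_frames(window_decisions, R, n_frames, ratio=0.8):
    """Label a frame as speech if enough of the windows covering it say speech.

    window_decisions[j] belongs to the window covering frames j .. j+R-1.
    """
    labels = []
    for t in range(n_frames):
        votes = [d for j, d in enumerate(window_decisions) if j <= t <= j + R - 1]
        labels.append(bool(votes) and sum(votes) >= ratio * len(votes))
    return labels


def detect(power, R, threshold, per_bin=entropy, ratio=0.8):
    """Threshold the long-term feature per window, then vote per frame."""
    n = len(power)
    decisions = [long_term_feature(power, m, R, per_bin) > threshold
                 for m in range(R - 1, n)]
    return vote_frames(decisions, R, n, ratio)


# Toy spectrogram: 4 frames of flat "noise", then 4 frames in which
# bin 0 fluctuates strongly while the other bins stay constant.
power = [[1.0, 1.0, 1.0]] * 4 + [[9.0, 1.0, 1.0], [1.0, 1.0, 1.0]] * 2
labels = detect(power, R=4, threshold=0.01)
flat_labels = detect(power, R=4, threshold=0.01, per_bin=flatness)
```

On stationary input every bin has the same entropy (or flatness) over the window, so the variance across bins is close to zero. A bin whose power fluctuates over the window pulls its statistic away from the others and raises the variance, which is what the threshold picks up. Passing `per_bin=flatness` switches the same pipeline to the flatness-based reading of the feature.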
Keywords/Search Tags: Voice Activity Detection (VAD), long-term features, LTSV, spectral flatness, dynamic characteristics