
The Research Of Voice Activity Detection In Low SNR Environments

Posted on: 2012-10-09  Degree: Master  Type: Thesis
Country: China  Candidate: G J Wang
GTID: 2248330395984900  Subject: Computer Science and Technology
Voice activity detection (VAD) classifies a speech signal into speech and non-speech segments and is widely used in speech communication systems, for example in speech enhancement, speech coding, and speech recognition. Effective VAD not only reduces the amount of speech signal processing but also improves system performance significantly. Current VAD methods perform well in high-SNR environments, but as background noise increases their performance declines sharply and some methods fail altogether, so research on VAD in low-SNR environments is critical.

First, the speech signal preprocessing methods are described, including pre-filtering and sampling, pre-emphasis, framing, and windowing. Then the common VAD methods are introduced in order: time-domain features, frequency-domain features, nonlinear features, and multi-feature integration. The mathematical models, experiments, and analysis of these approaches are given to provide a theoretical basis for research on VAD at low SNR. Four novel VAD methods are proposed, based on multi-feature integration and nonlinear features:

(1) The energy spectral entropy feature combines the advantages of energy and spectral entropy so that each compensates for the other's limitations. Following the same idea, statistical complexity, a nonlinear dynamic feature, is applied to VAD. Combining statistical complexity with the energy feature yields a new VAD feature, energy statistical complexity.

(2) Approximate entropy depends heavily on the record length and can give inconsistent results; sample entropy, an improved algorithm, has better properties in these respects. A novel VAD approach based on sample entropy is therefore proposed.

(3) Complex motion is commonly composed of orderly motion and stochastic motion, and the stochastic part is the basis of the C0 complexity measure. The traditional calculation of C0 complexity is based on Fourier analysis, which can only distinguish differences in the frequency domain and is not very effective for non-stationary signals. Wavelet analysis, by contrast, can exploit the differences between signal and noise in both the time domain and the frequency domain, so a new VAD method based on wavelet-transform C0 complexity is presented.

(4) Traditional Lempel-Ziv complexity analysis uses binary coarse-graining, and the time series it produces is likely to lose important information about the dynamical system. Multi-valued coarse-graining is therefore applied to reconstruct the time series, and a novel VAD method based on multi-valued coarse-graining Lempel-Ziv complexity is proposed.

Furthermore, the fuzzy c-means clustering algorithm and the Bayesian information criterion are adopted to estimate the feature thresholds, and a dual-threshold method is employed for VAD. Experiments on the TIMIT continuous speech database show that, in low-SNR environments, the detection performance of the four proposed methods is superior to that of the original baseline methods at the same time complexity.
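
To make the sample entropy feature from item (2) concrete, the following is a minimal sketch of the standard sample entropy definition applied to one frame of samples. It is not the thesis's implementation: the function name `sample_entropy`, the embedding dimension `m=2`, and the tolerance of 0.2 times the frame's standard deviation are illustrative assumptions (commonly used defaults), and the abstract does not state which values the author used.

```python
import math
import random


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a sequence x (illustrative sketch).

    m        -- template length (assumed default, not from the thesis)
    r_factor -- tolerance as a fraction of the standard deviation (assumed)
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("sequence too short for the chosen m")

    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * std

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is <= r.
        # Self-matches are excluded; this is the key difference from
        # approximate entropy. Both lengths use the same n - m templates.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined in theory; reported as infinity here
    return -math.log(a / b)


# Hypothetical frames: a regular (periodic) signal and Gaussian noise.
rng = random.Random(0)
periodic_frame = [math.sin(2 * math.pi * i / 25) for i in range(200)]
noise_frame = [rng.gauss(0.0, 1.0) for _ in range(200)]

print("periodic:", sample_entropy(periodic_frame))
print("noise:   ", sample_entropy(noise_frame))
```

A regular signal gives a lower value than white noise, which is the kind of contrast a per-frame feature can expose to a thresholding stage such as the dual-threshold method mentioned above. This direct implementation is quadratic in the frame length, so it suits short frames only.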
Keywords: Voice Activity Detection, Energy Statistical Complexity, Sample Entropy, Wavelet Transform, C0 Complexity, Multi-valued Coarse-graining, Lempel-Ziv Complexity