Generalized Likelihood Ratio Test For Voice Activity Detection Based On Source-Filter Model

Posted on: 2014-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: Q Wang
Full Text: PDF
GTID: 2268330425481403
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of speech signal processing techniques, Voice Activity Detection (VAD) has been applied successfully in many areas of communication systems, motivated by a variety of needs. VAD can increase the efficiency of a communication channel by more than 100%, which is important in mobile communication, satellite communication, and other bandwidth-limited applications. It can also reduce the average energy consumption of handheld devices and let the communication channel carry more data at the same time. VAD is also used in audio/speech monitoring, which must store massive amounts of sound data, where it greatly reduces the storage capacity required.

VAD techniques have developed rapidly over the past several decades. Besides the classical energy-based method, which exploits the energy difference between speech and noise, many VADs based on other principles have been proposed, such as statistical detectors that use the difference in higher-order statistics between speech and noise, wavelet-based VAD, HMM-based VAD, and cepstrum-based VAD.

In this thesis, a practical "source-filter" model of speech is proposed, based on the physical mechanism of human speech production. Building on this model, and using pitch information together with linear predictive analysis, a Generalized Likelihood Ratio Test (GLRT) detector consisting of two sub-detectors is constructed. The two sub-detectors detect voiced speech and unvoiced speech separately. The pitch information is incorporated in the voiced speech detector L1, and the linear prediction information is used in both the voiced speech detector L1 and the unvoiced speech detector L0 through an "estimate-and-plug" method. The two sub-detectors are then combined linearly: each sub-detector is treated as a feature, and linear discriminant analysis is used to calculate the optimal feature weights, yielding a feature-optimized generalized likelihood ratio test.

Energy-based detectors perform unsatisfactorily under low-SNR conditions and on low-energy speech, while detectors based on statistical features often require massive computation, which is a heavy burden for real-time speech processing systems. The proposed VAD avoids both disadvantages by exploiting features of the speech source and of the vocal tract modulation. It markedly improves voice activity detection performance with a tolerable increase in computational complexity, especially in Linear Predictive Analysis-Synthesis applications. Simulation and experimental results confirm the superior performance of the proposed detector under different uncorrelated noise conditions, as well as its robustness.
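
The abstract does not give the formulas used for the feature-optimized combination, so the following is only a minimal sketch of the general idea: treating two sub-detector outputs as a two-dimensional feature and computing a Fisher linear discriminant weight vector. The function names `lda_weights` and `combined_statistic`, the synthetic Gaussian scores, and the midpoint threshold are all assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np


def lda_weights(scores_speech, scores_noise):
    """Fisher LDA direction for two classes of feature vectors (rows = frames).

    Hypothetical helper: each row holds the two sub-detector scores of a frame.
    """
    mu_s = scores_speech.mean(axis=0)
    mu_n = scores_noise.mean(axis=0)
    # Within-class scatter: sum of the scatter matrices of both classes.
    d_s = scores_speech - mu_s
    d_n = scores_noise - mu_n
    s_w = d_s.T @ d_s + d_n.T @ d_n
    # Fisher direction: w is proportional to S_w^{-1} (mu_speech - mu_noise).
    w = np.linalg.solve(s_w, mu_s - mu_n)
    return w / np.linalg.norm(w)


def combined_statistic(scores, w):
    """Weighted linear combination of the sub-detector scores."""
    return scores @ w


# Synthetic stand-ins for sub-detector outputs (assumed, not real data).
rng = np.random.default_rng(0)
noise = rng.normal([0.0, 0.0], [1.0, 1.0], size=(500, 2))
speech = rng.normal([2.0, 1.0], [1.0, 1.0], size=(500, 2))

w = lda_weights(speech, noise)

# Illustrative threshold: midpoint between the two class means of the statistic.
threshold = 0.5 * (
    combined_statistic(speech, w).mean() + combined_statistic(noise, w).mean()
)
decisions = combined_statistic(speech, w) > threshold
print("weights:", w)
print("fraction of speech frames detected:", decisions.mean())
```

In a real detector the rows would be per-frame values of the two GLRT sub-detectors computed on labeled training audio, and the threshold would be chosen for a target false-alarm rate rather than as a midpoint.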
Keywords/Search Tags: Voice Activity Detection, Generalized Likelihood Ratio Test, Linear Predictive Analysis-Synthesis, Source-Filter Model, Feature Optimization