Research On Time-domain Voice Activity Detection In Noise Environment

Posted on: 2016-11-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Zhang
Full Text: PDF
GTID: 2348330488974151
Subject: Engineering
Abstract/Summary:
With the continued development of science and technology, speech communication has moved beyond person-to-person conversation and is now applied in a growing number of fields, which in turn drives the development of voice-related technologies. Because noise is always present, the accuracy and efficiency of speech systems are often severely degraded. Noise handling is therefore a crucial topic in speech signal processing research, and voice activity detection is one of its key technologies.

The purpose of voice activity detection is to decide whether a given signal segment is speech or noise. It is a front-end technology: the speech system chooses different processing methods according to the detection result, so this pre-detection step can both reduce noise interference and improve the robustness of the system. Voice activity detection has been studied extensively at home and abroad, with some results achieved, and it is now a necessary component of speech recognition systems, voice communication systems, and similar applications.

Voice activity detection methods fall into two main categories: time domain and frequency domain. Time-domain methods are mostly based on energy, zero-crossing rate, average amplitude, and similar features, while frequency-domain methods are mainly based on the cepstrum, spectral entropy, fractal features, the wavelet transform, and so on. Building on the fundamentals of speech signal processing, this thesis first presents the principle of voice activity detection and then introduces several common methods in both domains, covering their basic principles, implementation procedures, simulation results, and statistical properties. The advantages and disadvantages of these methods are also analyzed.
Given the characteristics of these methods and the requirements of this work, the thesis focuses on a joint time-domain voice activity detection method that uses double thresholds: "zero-crossing rate + energy". Experimental results show that this method, which combines the two features so that each offsets the other's weaknesses, improves detection performance, but the simulated signal waveforms still show shortcomings such as "inter-frame interference" and "trailing". The thesis therefore examines in depth the detection principles and decision criteria of energy and zero-crossing rate. After theoretical derivation and practical demonstration, a modified zero-crossing rate model is established, and a new double-threshold time-domain voice activity detection method is proposed. Experiments show that this method effectively solves the problems of the classical time-domain methods.

The thesis also reports accuracy statistics for each method under various noise conditions. Comparing the results clarifies the differences between independent and joint methods, classical and improved methods, and time-domain and frequency-domain methods, and gives a deeper understanding of the characteristics of each. In addition, experiments on speech endpoint detection and on English voice activity detection are carried out to analyze the performance of the detection methods from the perspective of functional extension. The application scenario, a hardware platform, is introduced together with the experimental method used on it.

Finally, the thesis summarizes the research and its main findings and conclusions, points out the problems that remain, and discusses the outlook for future research and development in voice activity detection.
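
For readers unfamiliar with the classical joint "zero-crossing rate + energy" approach mentioned above, the following Python sketch illustrates the general idea only. It is not the thesis's improved method (the modified zero-crossing rate model is not specified in this abstract). The sampling rate, frame length, threshold values, the synthetic test signal, and the function names `frame_features` and `double_threshold_vad` are all assumptions made for illustration.

```python
import math
import random

FS = 8000     # sampling rate in Hz (assumed for this illustration)
FRAME = 200   # frame length in samples, i.e. 25 ms at 8 kHz (assumed)


def frame_features(signal, frame_len=FRAME, hop=FRAME):
    """Return per-frame short-time energy and zero-crossing rate."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
        # Count sign changes between adjacent samples, normalized to [0, 1].
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcrs.append(crossings / (frame_len - 1))
    return energies, zcrs


def double_threshold_vad(energies, zcrs, e_high, e_low, z_thresh):
    """Simplified double-threshold labelling (hypothetical helper).

    Frames whose energy exceeds e_high seed speech segments. Each segment
    is then grown frame by frame, forwards and backwards, while the
    neighbouring frame exceeds the lower energy threshold or the
    zero-crossing threshold.
    """
    n = len(energies)
    speech = [e > e_high for e in energies]

    def weak_speech(i):
        return energies[i] > e_low or zcrs[i] > z_thresh

    for i in range(1, n):                # grow segments forwards
        if speech[i - 1] and not speech[i] and weak_speech(i):
            speech[i] = True
    for i in range(n - 2, -1, -1):       # grow segments backwards
        if speech[i + 1] and not speech[i] and weak_speech(i):
            speech[i] = True
    return speech


def tone(freq, amp, n):
    """Generate n samples of a sine wave at the given frequency."""
    return [amp * math.sin(2 * math.pi * freq * k / FS) for k in range(n)]


# Synthetic test signal (an assumption, not data from the thesis):
# 10 frames of silence, 3 frames of a weak high-frequency "fricative-like"
# sound, 10 frames of a strong low-frequency "voiced" tone, 10 frames of
# silence, all with low-level Gaussian background noise added.
rng = random.Random(0)
clean = [0.0] * 2000 + tone(3500, 0.05, 600) + tone(200, 1.0, 2000) + [0.0] * 2000
noisy = [x + rng.gauss(0.0, 0.001) for x in clean]

energies, zcrs = frame_features(noisy)
labels = double_threshold_vad(energies, zcrs, e_high=10.0, e_low=1.0, z_thresh=0.7)

speech_frames = [i for i, is_speech in enumerate(labels) if is_speech]
print("speech frames:", speech_frames[0], "to", speech_frames[-1])
```

In this synthetic case the weak high-frequency onset has too little energy to pass either energy threshold, so it is the zero-crossing test that attaches it to the voiced segment. With real noise the zero-crossing rate of the background can itself be high, which is one way a fixed threshold of this kind can produce the trailing effect the abstract describes.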
Keywords/Search Tags: Voice Activity Detection, Zero-crossing Rate, Bessel Function, End Point Detection, Modifying Factor