Research On Voice Activity Detection Method In The Presence Of Noise

Posted on: 2018-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: W J Bao
GTID: 2348330533956555
Subject: Electronic and communication engineering
Abstract/Summary:
Voice activity detection (VAD), also called endpoint detection, locates the start and end points of speech within a signal so that useful speech can be separated from unwanted speech or noise, which makes subsequent processing more efficient. It is widely used in speech recognition, speech enhancement, speech coding, and other tasks. Current research on endpoint detection follows two main approaches. The first is threshold-based detection, with common methods based on short-time energy, zero-crossing rate, and information entropy. The second is detection based on pattern recognition, with common methods including hidden Markov models and support vector machines. The result of endpoint detection plays a decisive role in the speech processing that follows.

This thesis studies endpoint detection in noisy environments. Because traditional detection methods suffer from a low detection rate at low signal-to-noise ratio (SNR), the speech signal is first preprocessed to remove noise, and the speech endpoints are then detected with the traditional cepstral distance method. The noise reduction step draws on deep learning, a research hotspot in recent years: a denoising autoencoder (DAE) is used for speech denoising, with some success.

Because the relationship between noise and speech is complex, and additive noise affects the sound in everyday life, the thesis focuses on detection performance under different noise types and different SNRs. Factory noise, Volvo noise, and white noise were selected from the NOISEX-92 database, together with clean speech from the TIMIT database, and were used to generate synthesized noisy speech with different noise types and SNRs. There are five kinds of synthesized noisy speech signals, with SNRs ranging from -10 dB to 10 dB. The DAE is trained by gradient descent to reconstruct the speech signal from its noisy version, minimizing the error against the original clean speech and thereby reducing the noise. Speech endpoints are then detected with the cepstral distance method in order to improve detection accuracy at low SNR.

The experimental results show that the accuracy of the traditional method drops rapidly at low SNR, whereas the proposed method greatly improves detection accuracy, especially under 0 dB SNR, and achieves a higher correct detection rate than the traditional detection algorithm.
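
The two signal-processing steps named in the abstract, mixing noise into clean speech at a chosen SNR and flagging speech frames by cepstral distance, can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the function names, frame length, hop size, number of cepstral coefficients, threshold rule, and the assumption that the first few frames contain only noise are all hypothetical choices made for illustration, and the denoising autoencoder stage is omitted entirely.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR in dB.

    Assumes `noise` is at least as long as `clean` (hypothetical helper).
    """
    noise = noise[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # Choose the gain so that p_clean / (gain**2 * p_noise) == 10**(snr_db / 10).
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def frame_cepstra(signal, frame_len=256, hop=128, n_ceps=12):
    """Real cepstrum (coefficients 0..n_ceps) of each Hamming-windowed frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    ceps = np.empty((n_frames, n_ceps + 1))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        # Small constant avoids log(0) on empty spectral bins.
        log_mag = np.log(np.abs(np.fft.rfft(frame)) + 1e-10)
        ceps[i] = np.fft.irfft(log_mag)[: n_ceps + 1]
    return ceps


def cepstral_distance_vad(signal, n_noise_frames=10, threshold_scale=3.0, **frame_kw):
    """Return one boolean per frame: True where the frame is flagged as speech.

    Assumption: the first `n_noise_frames` frames contain noise only, so their
    average cepstrum serves as the noise reference.
    """
    ceps = frame_cepstra(signal, **frame_kw)
    noise_ref = ceps[:n_noise_frames].mean(axis=0)
    diff = ceps - noise_ref
    # A commonly used form of the cepstral distance, expressed in dB
    # (4.3429 is approximately 10 / ln 10).
    dist = 4.3429 * np.sqrt(diff[:, 0] ** 2 + 2.0 * np.sum(diff[:, 1:] ** 2, axis=1))
    # Illustrative threshold: a multiple of the mean distance over the noise frames.
    threshold = threshold_scale * dist[:n_noise_frames].mean()
    return dist > threshold
```

For example, calling `cepstral_distance_vad(mix_at_snr(clean, noise, 10.0))` returns a per-frame mask whose first and last `True` entries approximate the speech endpoints. Lowering `snr_db` toward -10 makes the distances of speech and noise frames harder to separate, which is the regime where the abstract reports that the traditional method degrades.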
Keywords/Search Tags: Voice Activity Detection, Signal-to-Noise Ratio, Cepstral Distance, Denoising Autoencoder