
Research On Speech Endpoint Detection Methods Based On Noisy Backgrounds

Posted on: 2016-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: C Feng
GTID: 2348330542476027
Subject: Information and Communication Engineering
Abstract/Summary:
Speech endpoint detection marks the start and end points of the speech segments in a clean or noisy speech signal. It is an important preprocessing step for all kinds of speech processing systems. An effective detection method not only removes non-speech segments, which reduces interference in subsequent processing stages, but also reduces the amount of data to be processed and therefore the computation time. In real life, a quiet recording environment rarely exists, so the acquired speech signal is almost always corrupted by some kind of background noise. Research on speech endpoint detection under these conditions therefore benefits the stability and practicality of speech processing systems.

Effective endpoint detection methods fall into two categories. The first is the statistical model method, which gives good detection results. However, its model parameters must be obtained through repeated tests and statistics, which raises the complexity and the computational load and seriously affects the real-time performance of the signal processing system. The second, called feature extraction, extracts a feature value that highlights speech segments and suppresses noise segments, compares the extracted feature with a threshold, and cuts out the speech segments accordingly. This kind of method requires little computation, but as the signal-to-noise ratio (SNR) decreases, the features become less able to distinguish speech from non-speech segments, and detection accuracy suffers.

To improve the accuracy of endpoint detection in noisy backgrounds, this thesis proposes improved algorithms that build on several speech endpoint detection methods based on feature extraction. The main contents are as follows.

Firstly, an improved algorithm that uses cross-entropy as the decision rule is proposed. The background noise is first estimated based on the speech presence probability. The algorithm then uses the sub-band cross-entropy between speech and noise as the speech/non-speech discrimination feature. It is more robust than the original method at low SNR levels.

Secondly, to address the problem of mode mixing in empirical mode decomposition (EMD), an improved method based on ensemble empirical mode decomposition (EEMD) and the Teager kurtosis is proposed. The Teager energy operator is used to calculate the Teager energy of each intrinsic mode function (IMF) obtained by EEMD. A root power function and an order statistics filter are applied to the Teager kurtosis for feature extraction. Speech segments at low SNR levels can then be detected with a suitable threshold.

Thirdly, building on the detection method based on the observation sequence energy of speech compressive sensing, a signal subspace method is introduced as a denoising pre-processing step, after which the observation sequence energy of the speech is estimated. The detection results are better than those of the original method.
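
The feature extraction approach described in the abstract (extract a feature, compare it with a threshold, cut out the speech segments) can be illustrated with a minimal sketch. This is not one of the thesis's algorithms: short-time energy stands in for the feature, the threshold is assumed to be `k` times the mean feature over a few leading frames assumed to contain only noise, and every name and parameter (`frame_signal`, `detect_endpoints`, `make_demo_signal`, `frame_len`, `hop`, `noise_frames`, `k`) is hypothetical.

```python
import math
import random


def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames; a trailing partial frame is dropped."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame (the stand-in feature)."""
    return sum(s * s for s in frame) / len(frame)


def detect_endpoints(x, frame_len=200, hop=100, noise_frames=5, k=3.0,
                     feature=short_time_energy):
    """Return (start_sample, end_sample) pairs for runs of frames whose
    feature exceeds the threshold; end_sample is exclusive."""
    feats = [feature(f) for f in frame_signal(x, frame_len, hop)]
    noise = feats[:noise_frames]
    if not noise:
        return []
    # Assumption: the leading frames contain noise only, so their mean
    # feature value is a usable estimate of the noise level.
    threshold = k * (sum(noise) / len(noise))

    segments, start = [], None
    for idx, value in enumerate(feats):
        if value > threshold and start is None:
            start = idx * hop  # first frame of a speech-like run
        elif value <= threshold and start is not None:
            # The run ended with the previous frame.
            segments.append((start, (idx - 1) * hop + frame_len))
            start = None
    if start is not None:  # run extends to the last frame
        segments.append((start, (len(feats) - 1) * hop + frame_len))
    return segments


def make_demo_signal(seed=0, n=8000, fs=8000):
    """Synthetic test signal: Gaussian noise throughout, plus a 200 Hz tone
    (a crude stand-in for speech) between samples 3000 and 5000."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 0.05) for _ in range(n)]
    for i in range(3000, 5000):
        x[i] += math.sin(2 * math.pi * 200 * i / fs)
    return x


if __name__ == "__main__":
    print(detect_endpoints(make_demo_signal()))
```

Because `feature` is a parameter, a more noise-robust feature could be swapped in without changing the thresholding logic. A fixed multiple of the noise level is the simplest possible decision rule and, as the abstract notes for this family of methods, it becomes less reliable as the SNR drops.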
Keywords/Search Tags: Speech endpoint detection, cross-entropy, empirical mode decomposition, compressive sensing