
Research On Speech Endpoint Detection Methods Based On Multi-Feature Fusion

Posted on: 2020-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: C L Zhu
Full Text: PDF
GTID: 2428330599964888
Subject: Detection Technology and Automation
Abstract/Summary:
With the development of information technology, speech intelligence has gradually matured. Speech endpoint detection is a core technology in speech signal processing. Its purpose is to locate the start and end points of speech within a noisy speech signal, thereby reducing the computational complexity of speech signal processing and improving system performance. Existing endpoint detection methods tend to perform well at a high signal-to-noise ratio (SNR), but as the SNR decreases their results become unsatisfactory or fail altogether. To address this problem, this thesis improves the front-end speech noise reduction algorithm and combines it with an improved multi-feature fusion strategy for double-threshold endpoint detection. The superiority of the proposed method is verified by comparison with other methods. The research work and innovations of this thesis are mainly reflected in the following aspects:
(1) Combined with speech enhancement technology, a single-word speech endpoint detection method is proposed, based on Least Mean Square (LMS) adaptive filter noise reduction and improved multiple features. During noise processing, repeated median filtering and smoothing are introduced, which effectively reduce outlier noise in the speech signal. An improved fusion of logarithmic average energy and short-time threshold crossing rate is then used for double-threshold endpoint detection.
(2) A continuous speech endpoint detection method based on S-spectral subtraction and multiple features is proposed. Because the Short-Time Fast Fourier Transform (SFFT) used in the spectral subtraction method cannot effectively analyze non-stationary signals, the S-transform is introduced to make the speech more noise-resistant. The improved Mel Frequency Cepstrum Coefficient (MFCC) cepstral distance and the uniform sub-band variance feature are then combined to realize two-threshold, two-parameter detection.
(3) To improve the adaptability of threshold setting in speech endpoint detection, a dynamic threshold strategy based on noise estimation from the leading non-speech segment is adopted, so that the thresholds change dynamically with each speech segment according to the noise computed from its leading non-speech segment.
(4) To address the poor operability and large error of traditional evaluation methods, a confidence evaluation mechanism is proposed. The accuracy of endpoint detection is calculated from the endpoint detection rate, the missed detection rate, algorithm complexity and other indicators, which enhances the reliability of the experiments.
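The abstract gives no implementation details, so the following is only a minimal sketch of how a double-threshold detector with noise-derived thresholds might look. It uses per-frame logarithmic average energy alone and leaves out the threshold crossing rate, the LMS noise reduction, and the median smoothing. All function names, the frame sizes, and the margin values are assumptions chosen for illustration, not the thesis's actual parameters. The thresholds are set from the frames before speech begins, which are assumed to contain noise only.

```python
import math
import random


def frame_log_energy(signal, frame_len=200, hop=80):
    """Return log10 of the mean squared amplitude of each frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        mean_sq = sum(x * x for x in frame) / frame_len
        energies.append(math.log10(mean_sq + 1e-10))  # offset avoids log(0)
    return energies


def dynamic_thresholds(energies, noise_frames=10, low_margin=0.5, high_margin=1.0):
    """Place both thresholds above the mean energy of the leading frames.

    The leading frames are assumed to be noise only; the margins
    (in log10 units) are illustrative values, not tuned ones.
    """
    lead = energies[:noise_frames]
    noise_level = sum(lead) / len(lead)
    return noise_level + low_margin, noise_level + high_margin


def double_threshold_segments(energies, low, high, min_frames=3):
    """Return (start_frame, end_frame) pairs, both inclusive.

    A segment is triggered by a frame above `high`, then widened in
    both directions while neighbouring frames stay above `low`.
    """
    segments = []
    n = len(energies)
    i = 0
    while i < n:
        if energies[i] > high:
            start = i
            while start > 0 and energies[start - 1] > low:
                start -= 1
            end = i
            while end + 1 < n and energies[end + 1] > low:
                end += 1
            if end - start + 1 >= min_frames:  # drop very short bursts
                segments.append((start, end))
            i = end + 1
        else:
            i += 1
    return segments


def synthetic_signal(rate=8000, seed=0):
    """One second of weak noise with a 200 Hz tone from 0.4 s to 0.7 s."""
    rng = random.Random(seed)
    signal = [rng.uniform(-0.01, 0.01) for _ in range(rate)]
    for n in range(rate * 4 // 10, rate * 7 // 10):
        signal[n] += 0.5 * math.sin(2 * math.pi * 200 * n / rate)
    return signal


if __name__ == "__main__":
    energies = frame_log_energy(synthetic_signal())
    low, high = dynamic_thresholds(energies)
    for start, end in double_threshold_segments(energies, low, high):
        # Convert frame indices to seconds (hop of 80 samples at 8000 Hz).
        print(f"speech from {start * 80 / 8000:.2f} s to {(end * 80 + 200) / 8000:.2f} s")
```

Because the thresholds are recomputed from each recording's own leading frames, the same margins can be reused when the background noise level changes. A fixed pair of thresholds would need retuning in that case.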
Keywords/Search Tags:endpoint detection, LMS noise reduction, S-spectral subtraction, multi-feature fusion, dynamic threshold estimation