Methods Of Speech Endpoints Detection In Noisy Environments

Posted on:2013-11-23

Degree:Master

Type:Thesis

Country:China

Candidate:D L Hu

Full Text:PDF

GTID:2248330377453865

Subject:Applied Mathematics

Abstract/Summary:

Endpoint detection (voice activity detection, VAD) finds the start and end of the speech section in an input signal; that is, it separates the speech section from the background noise so that only the effective data is passed to the speech recognition system. After several years of research, VAD technology has developed greatly and satisfying results have been obtained in the laboratory. In reality, however, there are many types of noise, and their occurrence leads to poor detection results. It is therefore important to study VAD methods in noisy environments.

The detection methods proposed so far fall into two classes. The first is feature-based, since many features show the difference between the speech signal and the noise signal. In this approach the features are extracted first and compared with preset thresholds, and the speech is then separated from the noise based on the comparison. The second is model-based: the parameters of the speech and noise models must be estimated. The feature-based approach is easy to understand and implement, so it has been widely used. However, when the signal-to-noise ratio (SNR) is low, the speech is severely affected by the noise, or even submerged in it, and the detection results degrade. The model-based approach is computationally expensive and complex, so it has difficulty meeting the demands of real-time systems.

In this paper, feature-based VAD methods are studied and simulated, and some improvements are given to enhance the robustness of detection in noisy environments. The main contents are as follows.

Firstly, the noisy signal is de-noised by wavelet to suppress the noise, and detection with decision trees is proposed to improve the traditional double-threshold method. The simulation experiments indicate that the method with decision trees works better than the traditional one, and that the tendency of the traditional method's accuracy to fall as the SNR drops is moderated to some extent.

Secondly, some improvements are made to the adaptive band-partitioning spectral entropy algorithm. Before the adaptive band-partitioning spectral entropy is calculated, the noise level of the noisy signal is estimated to decide whether de-noising is necessary, because de-noising is of little benefit for signals in high-SNR environments. Then the probability formula based on each subband's power spectrum is tested with MATLAB, and the simulation results show that the improved formula represents the speech section much better than the former one. Compared with some other methods, the improved method based on adaptive band-partitioning spectral entropy has higher robustness.

Thirdly, subtractive clustering and k-means clustering are applied to voice activity detection, and simulation and analysis are carried out.
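
The feature-based scheme described above (extract a feature per frame, then compare it with preset thresholds) can be sketched as follows. This is a minimal Python sketch, not the implementation from the thesis: it uses short-time energy as the feature and a plain two-threshold (hysteresis) rule, without the wavelet de-noising or decision trees that the thesis proposes. The function names, the synthetic test signal, the frame length, the hop size, and the threshold values are all illustrative assumptions.

```python
import math
import random


def frame_energies(signal, frame_len, hop):
    """Short-time energy (sum of squared samples) of each full frame."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def double_threshold_vad(feature, low, high):
    """Return (start, end) frame-index pairs, with end inclusive.

    A run of consecutive frames whose feature stays above `low` is kept
    as a speech segment only if at least one frame in it exceeds `high`.
    """
    segments = []
    i, n = 0, len(feature)
    while i < n:
        if feature[i] > low:
            start, peak = i, feature[i]
            while i < n and feature[i] > low:
                peak = max(peak, feature[i])
                i += 1
            if peak > high:
                segments.append((start, i - 1))
        else:
            i += 1
    return segments


# Synthetic input (an assumption for illustration): weak uniform noise
# throughout, with a sine tone standing in for speech in samples 800..1599.
rng = random.Random(0)
signal = []
for n in range(2400):
    noise = rng.uniform(-0.01, 0.01)
    tone = 0.5 * math.sin(2 * math.pi * 0.05 * n) if 800 <= n < 1600 else 0.0
    signal.append(tone + noise)

energies = frame_energies(signal, frame_len=200, hop=100)

# Thresholds are hand-picked for this toy signal, not taken from the thesis.
segments = double_threshold_vad(energies, low=1.0, high=10.0)
print(segments)
```

With these assumed thresholds the sketch reports a single segment covering the frames that overlap the tone. Because both thresholds are fixed, the rule stops separating speech from noise once the noise energy approaches the low threshold, which is the low-SNR weakness of threshold-based methods that the abstract points out.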

Keywords/Search Tags:

Speech Endpoints Detection (Voice Activity Detection), Decision Trees, Adaptive Band-partitioning Spectral Entropy, Noise Estimation, K-means Clustering

Related items

1	Research On Voice Activity Detection Algorithm In Low SNR
2	Research On Noisy Voice Activity Detection Method
3	Research Of Effient Speech Enhancement And Voice Activity Detection
4	Research Of Speech Endpoint Detection Based On Spectral Entropy
5	Dsp-based Voice Activity Detection And Neural Network Adaptive Filter
6	Research On Speech Detection Method In Noise Environments
7	Study On The Voice Activity Detection Method In Low SNR Environment
8	Endpoint Detection Algorithm For Speech Signal In Low SNR Environment
9	Speech Endpoint Detection Based On Statistical Models
10	Research On Tibetan Voice Activity Detection Algorithm