Voice Activity Detection Based On Morphological Component Analysis

Posted on: 2017-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: M G Fu
Full Text: PDF
GTID: 2348330482984839
Subject: Computer technology
Abstract/Summary:
The purpose of voice activity detection (VAD) is to distinguish speech segments from non-speech segments in a speech signal. VAD plays a fundamental role in speech coding, speech synthesis, speech analysis, speech enhancement, speaker recognition and other applications, where it serves as a preprocessing step. VAD is important in two respects. On the one hand, non-speech segments are useless data for speech processing, so excluding them reduces the amount of data to be processed and improves the efficiency of the system. On the other hand, non-speech segments usually contain noise, which can seriously degrade system performance. Research on voice activity detection is therefore important.

At present, voice activity detection methods are mainly based on feature parameters, statistical models and machine learning. When the noise varies, the performance of these methods suffers and their accuracy is not very high. To address this problem, this thesis applies morphological component analysis (MCA) to voice activity detection in order to improve the robustness and accuracy of VAD. The main work is as follows.

This thesis puts forward two voice activity detection methods based on morphological component analysis: a VAD method for varying noise and a VAD method with an online-updated noise dictionary. A dictionary optimization method is also used in the proposed VAD.

First, the VAD method for varying noise. The main steps are: feature extraction, in which the signal is split into frames, windowed and Fourier transformed, and the amplitude spectrum is taken as the feature; dictionary training, in which the K-SVD algorithm is used to train the speech and noise dictionaries; noise type identification with Gaussian mixture models (GMM), in which one GMM is trained for each type of noise and the trained GMMs are then used to identify the unknown noise type and select the corresponding noise dictionary; and sparse coding and decision, in which the selected noise dictionary and the speech dictionary are concatenated into one large dictionary for the MCA algorithm and the detection result is determined from the sparse coefficients. This method can adapt to different noise environments, and the experimental results show that it achieves higher detection accuracy.

Second, the VAD method with an online-updated noise dictionary. The main steps are: feature extraction, in which the signal is framed, windowed and Fourier transformed to obtain the amplitude spectrum features; dictionary learning, in which the K-SVD algorithm is used to train the speech dictionary; online updating of the noise dictionary, in which an online dictionary updating algorithm is used to train the noise dictionary; dictionary concatenation and MCA sparse coding, in which the speech dictionary and the noise dictionary are concatenated for the MCA algorithm; and frame classification, in which speech and non-speech frames are distinguished by their sparse coefficients. This method trains the noise dictionary directly on the noise in the signal, needs no dedicated noise training data, and lets the noise dictionary adapt better to the signal. Experimental results show that the method has relatively high detection accuracy.

Third, the dictionary optimization method. To remove unimportant atoms, an algorithm is proposed to mark important atoms so that atoms can be distinguished by importance. Then, to remove harmful atoms, an algorithm for removing harmful atoms is proposed, thereby optimizing the dictionary. Dictionary optimization improves the quality of the dictionary and, in turn, the detection accuracy of dictionary-based methods. The experimental results show that the accuracy of voice activity detection improves after dictionary optimization.

Experiments were carried out with the above methods. A comparison of the experimental results shows that the proposed methods have better noise robustness and higher detection accuracy, which demonstrates the effectiveness of the proposed algorithms.
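
The per-frame pipeline described in the abstract (framing, windowing, amplitude spectrum, sparse coding over a concatenated speech and noise dictionary, decision from the sparse coefficients) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not the thesis's implementation: the dictionaries are hand-made single-atom stand-ins rather than K-SVD-trained ones, plain matching pursuit stands in for the MCA sparse coding step, the decision rule (comparing coefficient energy per sub-dictionary) is one plausible reading of "through the sparse coefficients", and all function names, frame sizes and test tones are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Hann-window one frame and return its DFT magnitudes (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def frame_features(signal, frame_len=64, hop=32):
    """Split the signal into overlapping frames; one feature vector per frame."""
    return [magnitude_spectrum(signal[s:s + frame_len])
            for s in range(0, len(signal) - frame_len + 1, hop)]


def normalize(v):
    """Scale a vector to unit Euclidean norm (zero vectors are left as-is)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def matching_pursuit(x, atoms, n_iter=3):
    """Greedy sparse coding over unit-norm atoms (assumed stand-in for MCA)."""
    residual = list(x)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        dots = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(dots[j]))
        coeffs[best] += dots[best]
        residual = [r - dots[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs


def is_speech(feature, speech_atoms, noise_atoms, n_iter=3):
    """Code the frame over [speech | noise]; assumed rule: compare energies."""
    coeffs = matching_pursuit(feature, speech_atoms + noise_atoms, n_iter)
    k = len(speech_atoms)
    speech_energy = sum(c * c for c in coeffs[:k])
    noise_energy = sum(c * c for c in coeffs[k:])
    return speech_energy > noise_energy


def tone(freq_bin, n, frame_len=64, amp=1.0):
    """Toy signal: a sinusoid centred on one DFT bin of a frame_len-point DFT."""
    return [amp * math.sin(2 * math.pi * freq_bin * t / frame_len)
            for t in range(n)]


# Hand-made one-atom dictionaries (a real system would learn many atoms).
speech_atoms = [normalize(magnitude_spectrum(tone(4, 64)))]
noise_atoms = [normalize(magnitude_spectrum(tone(20, 64)))]

# 128 samples of "noise" only, then 128 samples of "speech" plus the same noise.
noise_part = tone(20, 128, amp=0.3)
speech_part = [s + n for s, n in zip(tone(4, 128), tone(20, 128, amp=0.3))]
signal = noise_part + speech_part

decisions = [is_speech(f, speech_atoms, noise_atoms)
             for f in frame_features(signal)]
print(decisions)
```

In this toy setup the frames that lie entirely in the noise-only part are coded almost exclusively by the noise atom, while frames in the second part put most of their coefficient energy on the speech atom. The frame that straddles the boundary is genuinely ambiguous, which is one reason practical VAD systems usually smooth frame-level decisions over time.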
Keywords/Search Tags: voice activity detection, sparse representation, morphological component analysis