
Research On Voice Activity Detection Method Based On Ensemble Learning

Posted on: 2019-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: P C Liao
Full Text: PDF
GTID: 2428330566999272
Subject: Electronic and communication engineering
Abstract/Summary:
With the development of information technology, speech interaction plays an increasingly important role in information transmission, and speech signal processing has grown in importance accordingly. Voice activity detection (VAD) is a front-end stage of speech signal processing and a necessary step in tasks such as speech enhancement and speech recognition, so its accuracy strongly affects the performance of the whole system. Traditional VAD methods reach high accuracy at high signal-to-noise ratio (SNR), but real environments contain many kinds of noise and often have low SNR, which causes detection performance to drop sharply. It is therefore of great significance to improve VAD accuracy under different background noises and at low SNR. After summarizing the existing VAD algorithms, this thesis proposes using the ensemble learning method gcForest for voice activity detection. Ensemble learning is a machine learning method that combines multiple learners and can adapt to different learning tasks through different combinations. Compared with a single learner, ensemble learning can significantly improve a system's generalization ability, which offers a new approach to voice activity detection.

The main work of this thesis is as follows. First, based on a review of a large body of literature, the advantages and disadvantages of existing VAD algorithms are analyzed and verified by simulation. Second, Mel-frequency cepstral coefficients (MFCCs), which reflect the perceptual characteristics of the human ear, are used as features, and the ensemble learning framework gcForest is used as the classifier. The data are modeled using the learning and generalization ability of gcForest, and K-fold cross-validation is used to prevent overfitting. Simulations show that the ensemble classifier achieves better detection performance and noise immunity than an SVM. Third, to increase the diversity of the ensemble, two different kinds of AdaBoost ensemble learners are added to the gcForest framework, which perturbs both the input attributes and the algorithm parameters. The improved gcForest framework is less sensitive to noise. Fourth, an HMM is used as a front-end structure for the gcForest framework. The HMM, which models dynamic data well, is used to model the speech data, and the Viterbi algorithm is used for decoding. The decoded N-best recognition results are used as gcForest feature vectors to further improve VAD performance.
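The Viterbi decoding step mentioned in the fourth point can be sketched as follows. This is only an illustrative sketch, not the thesis's implementation: it assumes a two-state HMM (0 = non-speech, 1 = speech), returns only the single best path rather than the N-best results the thesis feeds to gcForest, and uses made-up transition probabilities and per-frame speech scores. The names `viterbi`, `log_trans`, `log_init`, and `p_speech` are hypothetical.

```python
import math


def viterbi(log_emissions, log_trans, log_init):
    """Return the most likely state sequence of a discrete-state HMM.

    log_emissions[t][s]: log-likelihood of frame t under state s
    log_trans[i][j]:     log-probability of moving from state i to state j
    log_init[s]:         log-probability of starting in state s
    """
    n_states = len(log_init)
    # Best log-score of any path that ends in each state at the current frame.
    score = [log_init[s] + log_emissions[0][s] for s in range(n_states)]
    backpointers = []
    for frame in log_emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor state for landing in state j at this frame.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + frame[j])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Assumed model: states tend to persist (0.9 stay, 0.1 switch), uniform start.
STAY, SWITCH = math.log(0.9), math.log(0.1)
log_trans = [[STAY, SWITCH], [SWITCH, STAY]]
log_init = [math.log(0.5), math.log(0.5)]

# Made-up per-frame speech scores, used here in place of real emission likelihoods.
p_speech = [0.1, 0.1, 0.6, 0.1, 0.9, 0.9, 0.9, 0.1]
log_emissions = [[math.log(1.0 - p), math.log(p)] for p in p_speech]

labels = viterbi(log_emissions, log_trans, log_init)
print(labels)  # one 0/1 label per frame
```

Because the transition model penalizes switching states, the decoded path tends to suppress isolated frames that a per-frame threshold would label as speech, which is one reason an HMM front end can help with dynamic data.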
Keywords/Search Tags:Voice Activity Detection, Ensemble Learning, gcForest, AdaBoost, HMM