
Research And Implementation Of Voice Activity Detection Algorithm Based On Convolutional Neural Networks

Posted on: 2017-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: R H Sun
GTID: 2348330518493468
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the development of speech signal processing technology, voice activity detection (VAD) has been applied successfully in many areas of communication systems and has become an essential part of speech coding, speech recognition, and voice classification. Improving VAD accuracy, especially in complex noise environments, has been a major research issue in recent years. Traditional VAD techniques, especially those based on short-term energy and zero-crossing rate, cannot meet the needs of practical applications in complex noise environments. As an intelligent processing method for audio and video classification, the convolutional neural network (CNN) has attracted wide attention from researchers. Building on a study of CNNs and on existing research results, this thesis proposes a new VAD algorithm based on a convolutional neural network. The main work is as follows. First, a CNN architecture suited to speech signal processing is designed. The MFCC feature parameters of each frame of the sample speech, together with their differential parameters, are fed to the network, which is trained until it can distinguish speech frames from non-speech frames based on these input features. Next, the proposed VAD algorithm is compared in simulation with the classic ITU-T Annex B VAD and GSM VAD1 algorithms. The simulation results show that the proposed VAD outperforms both, in simple communication environments such as ordinary indoor and outdoor settings and in complex ones such as bus stations, airports, and cafes. After the simulation, the proposed VAD algorithm is implemented in C and integrated into the WebRTC platform. Its performance is then tested on different types of mobile phones in office, outdoor, and cafeteria environments, and the test results are analyzed in terms of subjective MOS scores and objective hit rate. The results show that the designed VAD outperforms the VAD built into WebRTC. Finally, the algorithms studied in this thesis are summarized and ongoing improvements are discussed.
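For readers unfamiliar with the traditional baseline mentioned in the abstract, the following is a minimal sketch of a frame-level detector built on short-term energy and zero-crossing rate, together with one possible frame-level hit-rate measure. It is not the thesis's CNN and not any standardized VAD. The function names, the 160-sample frame with 80-sample hop (20 ms and 10 ms at an assumed 8 kHz sampling rate), both thresholds, and the hit-rate definition (fraction of reference speech or non-speech frames labeled correctly) are illustrative assumptions, since the abstract does not specify them.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sequence of samples into overlapping frames (incomplete tail dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def energy_zcr_vad(signal, frame_len=160, hop=80, energy_thr=0.01, zcr_thr=0.25):
    """Label a frame 1 (speech) when its energy is high and its ZCR is low.

    The thresholds are hypothetical fixed values chosen for the demo below.
    """
    return [
        1 if short_term_energy(f) > energy_thr and zero_crossing_rate(f) < zcr_thr else 0
        for f in frames(signal, frame_len, hop)
    ]

def hit_rates(decisions, reference):
    """Return (speech hit rate, non-speech hit rate) for aligned frame labels."""
    speech = [d for d, r in zip(decisions, reference) if r == 1]
    silence = [d for d, r in zip(decisions, reference) if r == 0]
    hr_speech = sum(speech) / len(speech) if speech else float("nan")
    hr_silence = silence.count(0) / len(silence) if silence else float("nan")
    return hr_speech, hr_silence

# Synthetic demo: 800 samples of silence followed by 800 samples of a
# 200 Hz tone (amplitude 0.5) standing in for voiced speech at 8 kHz.
SAMPLE_RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(800)]
signal = [0.0] * 800 + tone

decisions = energy_zcr_vad(signal)
# Hypothetical reference labels: the frame that straddles the onset
# is counted as non-speech, so the detector is charged one false alarm.
reference = [0] * 10 + [1] * 9

if __name__ == "__main__":
    print("decisions:", decisions)
    print("hit rates (speech, non-speech):", hit_rates(decisions, reference))
```

Fixed thresholds of this kind work on clean signals like the synthetic one above but degrade quickly when background noise raises the energy floor or adds its own zero crossings, which is the limitation the abstract cites as motivation for a learned classifier.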
Keywords/Search Tags:Voice Activity Detection, Convolutional Neural Networks, Mel-Frequency Cepstrum Coefficients, Weight Learning