Multichannel Voice Activity Detection Based On Neural Network

Posted on:2019-09-30

Degree:Master

Type:Thesis

Country:China

Candidate:S M Wang

GTID:2428330563456746

Subject:Computer Science and Technology

Abstract/Summary:

Voice activity detection (VAD) is an important preprocessing step in speech signal processing. Its goal is to separate the speech segments of an audio signal from the non-speech segments to facilitate subsequent processing such as automatic speech recognition and speaker recognition. With the growing popularity of artificial intelligence and human-computer interaction, speech recognition has become a central task in speech signal processing with a wide range of applications. Because VAD is the first stage of a speech recognition system, its accuracy is crucial, and improving it has attracted the attention of many researchers. This thesis first introduces the most representative traditional VAD algorithms, such as the double-threshold method, the variance method, and the spectral entropy method. With these traditional signal processing methods, good detection results can be obtained on speech with a high signal-to-noise ratio (SNR) by tuning the parameters and thresholds. However, experiments show that these algorithms are not robust across different types of speech or different noise environments. To address this problem, this thesis uses a deep neural network (DNN) and a convolutional neural network (CNN) as classification models, combined with the multi-channel speech signals collected by a microphone array. Speech features are extracted from single-channel, dual-channel, and five-channel signals and used as input to the classification models, and comparative experiments are conducted. The experiments use the CHiME-3 speech data set, whose noise environments are four scenes common in daily life: bus, cafe, pedestrian area, and street. The experimental results show that using the multi-channel speech signals collected by the microphone array as input effectively improves the classification performance of both the DNN and CNN models, and that the neural network models show a degree of robustness to different types of noise environments.
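
To make the threshold tuning mentioned in the abstract concrete, the following is a minimal sketch of an energy-only double-threshold detector. It is not the thesis's implementation: the function names `frame_energies` and `double_threshold_vad`, the parameters `low` and `high`, and the omission of the zero-crossing-rate stage often paired with this method are all assumptions made for illustration.

```python
def frame_energies(signal, frame_len, hop):
    """Short-time energy (sum of squared samples) of each full frame."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def double_threshold_vad(energies, low, high):
    """Label frames as speech (True) or non-speech (False).

    Hypothetical simplified variant: a frame whose energy exceeds `high`
    is treated as definite speech, and the segment is then extended to
    the left and right for as long as neighbouring frames stay above `low`.
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    n = len(energies)
    speech = [False] * n
    for i, energy in enumerate(energies):
        if energy > high and not speech[i]:
            # Extend the segment backwards from the definite-speech frame.
            j = i
            while j >= 0 and energies[j] > low:
                speech[j] = True
                j -= 1
            # Extend the segment forwards.
            j = i + 1
            while j < n and energies[j] > low:
                speech[j] = True
                j += 1
    return speech
```

In this sketch a frame that rises above `low` but never connects to a frame above `high` stays labeled as non-speech, which is what makes the result sensitive to how the two thresholds are tuned for a given noise level.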

Keywords/Search Tags:

Voice activity detection, DNN, CNN, Microphone arrays

Related items

1	Study On Speech Enhancement With Microphone Arrays
2	Research On Beamforming Technology Based On Differential Microphone Arrays
3	Study On Key Techniques In Speech Enhancement With Microphone Array
4	Research On Speech Enhancement Algorithms Of Microphone Array
5	Indoor Voice Localization And Track Based On Microphone Arrays
6	Research On Time-domain Voice Activity Detection In Noise Environment
7	The Research On Microphone Array Speech Enhancement Algorithms In Digital Hearing Aids
8	The Voice Activity Detection Technology With Application To Emergency Response Communications
9	Measured Based On The Low SNR Of The Microphone Array Speech Source To The Technical Study
10	Research And Implementation Report-oriented Voice Activity Detection