A New Method Based On HMMs For Noise-Robust Voice Activity Detector

Posted on:2013-10-26

Degree:Master

Type:Thesis

Country:China

Candidate:B Luo

GTID:2248330377953763

Subject:Computer application technology

Abstract/Summary:

Voice activity detection (VAD) has become an indispensable part of speech and audio processing tasks such as speech recognition, speech classification, and speech coding. As a pre-processing step for speech recognition, even a minor improvement in speech boundary detection improves overall system performance in the long run.

The traditional double-threshold VAD method performs poorly under increasingly complex noise. Recently, many attractive statistical model-based VAD algorithms using the likelihood ratio test (LRT) have been developed. They have contributed significantly to progress in voice activity detection, especially the statistical methods based on hidden Markov models (HMMs). However, the traditional LRT model is based on a single hidden Markov model with two states, which cannot adequately capture the observation probabilities of the different states. We therefore propose a novel method in which the LRT is based on two HMMs: model 0 for non-speech and model 1 for speech. In this method, minor differences between the two patterns are accumulated across the four states of each model.

The thesis is organized as follows. First, the applications and significance of VAD in speech and audio processing are described, followed by a review of VAD research in China and abroad. Second, we present the elements of an HMM and the three basic problems for HMMs. Next, we propose a novel HMM scheme, based on two models with four states each, to detect voice activity endpoints. Several speech features are then discussed, including the short-time fractal dimension and the autocorrelation-based pitch feature, followed by the second-order difference MFCC, which more closely approximates the human auditory system. We then use an LRT that combines the second-order difference MFCC with the two-model, four-state HMMs to determine the endpoints of speech. K-means clustering is applied to the LRT values to obtain the threshold between speech and non-speech. Finally, experimental results show that the proposed HMMs and the second-order difference MFCC perform better in complex noise backgrounds than the other methods discussed above.
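The decision rule described in the abstract (a likelihood ratio between a speech HMM and a non-speech HMM, thresholded by K-means) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The assumptions are: scalar features stand in for the MFCC-based feature vectors, each four-state model has hand-set Gaussian emissions and a hypothetical "sticky" ergodic transition matrix instead of trained parameters, and K-means is run with k = 2 on the one-dimensional ratio scores. All function and variable names (`make_hmm`, `llr`, `two_means_threshold`, and so on) are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def make_hmm(means, stay=0.7):
    """Toy HMM (assumption): uniform start, sticky transitions, unit variances."""
    n = len(means)
    move = (1.0 - stay) / (n - 1)
    log_pi = [math.log(1.0 / n)] * n
    log_a = [[math.log(stay if i == j else move) for j in range(n)]
             for i in range(n)]
    return log_pi, log_a, means, [1.0] * n


def hmm_log_likelihood(obs, hmm):
    """Forward algorithm in the log domain: log P(obs | hmm)."""
    log_pi, log_a, means, variances = hmm
    n = len(log_pi)
    alpha = [log_pi[j] + log_gauss(obs[0], means[j], variances[j])
             for j in range(n)]
    for x in obs[1:]:
        alpha = [logsumexp([alpha[i] + log_a[i][j] for i in range(n)])
                 + log_gauss(x, means[j], variances[j])
                 for j in range(n)]
    return logsumexp(alpha)


def llr(segment, speech_hmm, noise_hmm):
    """Log-likelihood ratio of the speech model against the non-speech model."""
    return (hmm_log_likelihood(segment, speech_hmm)
            - hmm_log_likelihood(segment, noise_hmm))


def two_means_threshold(scores, max_iter=100):
    """1-D K-means with k = 2; returns the midpoint between the two centroids."""
    lo, hi = min(scores), max(scores)
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        low = [s for s in scores if s <= mid]
        high = [s for s in scores if s > mid]
        if not high:  # all scores identical, nothing to split
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # centroids stopped moving
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


# Hypothetical four-state models: model 1 (speech) and model 0 (non-speech).
speech_hmm = make_hmm([2.0, 3.0, 4.0, 5.0])
noise_hmm = make_hmm([-0.5, 0.0, 0.5, 0.0])

# Made-up scalar feature segments, alternating speech-like and noise-like.
segments = [
    [3.0, 4.0, 3.5, 4.5, 3.0],
    [0.1, -0.2, 0.3, 0.0, -0.1],
    [4.0, 3.0, 4.5, 3.5, 3.0],
    [0.2, 0.0, -0.3, 0.1, 0.4],
]
scores = [llr(seg, speech_hmm, noise_hmm) for seg in segments]
threshold = two_means_threshold(scores)
decisions = [s > threshold for s in scores]  # True means "speech"
```

Because each model has several states, the forward pass sums over all state paths, so small per-frame differences between the two patterns accumulate over the segment before the ratio is taken. Clustering the scores gives a data-driven threshold instead of a fixed one, which is the role the abstract assigns to K-means.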

Keywords/Search Tags:

VAD, speech and audio processing, characteristic properties, HMM, LRT, K-means cluster

Related items

1	Research On Two Typical Speech Processing Applications Based On Deep Learning
2	External/internal data fusion testbed: History, components, and experimental analysis (speech processing, audio processing functions)
3	Biologically inspired auditory attention models with applications in speech and audio processing
4	Key Technology Research On Audio Information Hiding And Information Security Application For Speech Recognition
5	The Recognition Of Radio Frequency Band Abnormal Signal
6	The Research On Fuzzy C-Means Cluster Analysis And Its Applications
7	Based On An Audio Match Of The Smart Broadcast Advertisements
8	Audio segmentation for meetings speech processing
9	Research On Unified Speech And Audio Coding Algorithm
10	Research On Emotion Recognition Of Speech Signal Based On HMM