Research On Keyword Spotting In Continuous Speech Based On Point Process Models

Posted on: 2014-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2268330401976771
Subject: Military Intelligence
Abstract/Summary:
Keyword spotting searches continuous, unconstrained speech for a given set of words, and it has broad application prospects. Speech events are the occurrences and abrupt changes of various speech attributes and features. This thesis treats speech as a sequence of speech events, calls each speech event a “point”, and formulates a keyword spotting framework for continuous speech based on point process models (PPM). It focuses on building point processes from phoneme events and on creating different keyword models for PPM-based keyword spotting. The work in this dissertation is summarized as follows:

Building speech point processes from phoneme events. Departing from the traditional frame-based representation of speech as a dense time series of vectors, the PPM framework represents speech as sparse phoneme events. First, frame-level phoneme posteriorgrams are computed with TRAP. Then two methods, direct event selection and event selection using matched filters, are used to verify phoneme events. Finally, speech point processes are built from the phoneme events.

Spotting keywords with Poisson processes based on PPM. According to the distribution of the phonemes in a keyword, we model phoneme events with a Poisson process, train a background model and a keyword model separately, and use their likelihood ratio for keyword verification. If the candidate speech segment is the keyword, the detector output is high; otherwise, it is low. The detector output is then compared against a suitable threshold. Experimental results show that this method is effective and reliable.

Spotting keywords based on word level discriminative point process models. Because the phonemes of a keyword have long-range context dependencies, directly modeling entire words may permit more accurate and robust decoding of the speech signal. We create word level discriminative point process models and treat word spotting as a binary classification problem. The candidate speech segment is transformed into a supervector by duration normalization, segment counting, and concatenation in sequence; a classifier (Support Vector Machine, SVM) then produces a whole-word confidence score. This method avoids the assumption of independence among phoneme events and takes the long-range context dependencies of the keyword's phonemes into account, so keyword spotting performance improves greatly.

Spotting keywords based on exploiting word level discriminative point process models. Segmenting the point processes may cause some problems, so we use an alternative method that smooths the point processes with a Gaussian function, which describes the speech point processes more accurately. Experiments show that this method improves the accuracy of keyword spotting.
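The likelihood-ratio detector and the supervector construction described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that a candidate segment is normalized to unit duration and split into equal bins, that the keyword model is a table of expected per-bin phoneme counts, and that the background model is a constant per-second rate for each phoneme. The names `bin_counts`, `poisson_llr`, `supervector`, `keyword_means`, and `background_rates`, as well as all numeric values, are hypothetical.

```python
import math


def bin_counts(events, t_start, t_end, n_phones, n_bins):
    """Count phoneme events per (phoneme, bin) in a candidate segment.

    events: iterable of (phoneme_index, time_in_seconds) pairs.
    Time is normalized to [0, 1) so segments of any duration share the same bins.
    """
    duration = t_end - t_start
    counts = [[0] * n_bins for _ in range(n_phones)]
    for phone, t in events:
        if t_start <= t < t_end:
            u = (t - t_start) / duration  # normalized position within the segment
            counts[phone][min(int(u * n_bins), n_bins - 1)] += 1
    return counts


def poisson_llr(counts, keyword_means, background_rates, duration):
    """Log-likelihood ratio of the keyword model against the background model.

    keyword_means[p][d]: assumed expected count of phoneme p in bin d (keyword).
    background_rates[p]: assumed events per second of phoneme p (background).
    The log(n!) terms of the two Poisson likelihoods cancel in the ratio.
    """
    n_bins = len(counts[0])
    llr = 0.0
    for p, row in enumerate(counts):
        # Expected background count of phoneme p in one bin of this segment.
        bg_mean = background_rates[p] * duration / n_bins
        for d, n in enumerate(row):
            kw_mean = keyword_means[p][d]
            llr += n * (math.log(kw_mean) - math.log(bg_mean)) - (kw_mean - bg_mean)
    return llr


def supervector(counts):
    """Concatenate the per-phoneme bin counts into one flat feature vector."""
    return [n for row in counts for n in row]


# Toy example: two phonemes, two bins; the keyword expects phoneme 0, then phoneme 1.
keyword_means = [[0.9, 0.1], [0.1, 0.9]]
background_rates = [0.5, 0.5]
match = bin_counts([(0, 0.1), (1, 0.6)], 0.0, 1.0, n_phones=2, n_bins=2)
mismatch = bin_counts([(1, 0.1), (0, 0.6)], 0.0, 1.0, n_phones=2, n_bins=2)
llr_match = poisson_llr(match, keyword_means, background_rates, duration=1.0)
llr_mismatch = poisson_llr(mismatch, keyword_means, background_rates, duration=1.0)
is_keyword = llr_match > 0.0  # the threshold 0.0 is arbitrary in this sketch
```

In this toy setup the segment whose phoneme order matches the keyword scores above zero and the reversed segment scores below it, which is the high/low detector behaviour the abstract describes. The flat vector returned by `supervector` is the kind of fixed-length input that a word level classifier such as an SVM could consume; training that classifier is not shown.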
Keywords/Search Tags: Speech Keyword Spotting, Speech Event, Point Process Models of Speech, Poisson Process, Support Vector Machine, Word Level Discriminative Point Process Models, Exploiting Word Level Discriminative Point Process Models