Research On Keyword Spotting Based On Complementary Model And Score Fusion

Posted on: 2021-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: L H Li
GTID: 2428330611966435
Subject: Signal and Information Processing
Abstract/Summary:
With the development of computers and smartphones in recent years, society has gradually entered the era of artificial intelligence. Voice is the most convenient way for humans to communicate, which has made voice-based human-machine interaction a hot research topic: it frees users' hands and makes daily life more convenient. Keyword spotting is an active research area within speech recognition. It does not need to recognize all of the speech content; it only needs to detect a small set of keywords in continuous speech. It is widely used in phone monitoring, smart homes, smartphones and other fields.

This thesis proposes a keyword spotting method based on complementary models and score fusion for low data resource scenarios. In addition to modeling keywords with audio experience trajectories, the i-vector (Identity Vector) technique from speaker recognition is adapted to introduce a keyword modeling method called the w-vector (Word Vector). Together, the two models address the limited expressiveness of a single model. Combining the scores of the two algorithms exploits the complementarity of their discriminative information and thereby addresses the unreliable decisions of a single model.

The work of this thesis is as follows:

1. A keyword spotting method based on audio experience trajectories is implemented. It consists of three main steps: constructing the speech feature space using Gaussian distributions, calculating the distribution of the class attributes of each keyword's audio features in the speech feature space, and calculating the transition probability between the identifiers of the audio samples. A series of experiments explores how parameters such as window length, number of feature space identifiers, amount of labeled data and similarity calculation method affect the algorithm's performance.

2. A keyword spotting method based on the w-vector is implemented. The i-vector method from speaker recognition is applied to keyword spotting, and a vector feature representing keyword identity, called the w-vector, is constructed for each keyword. The w-vector of each keyword is obtained by calculating the keyword's Gaussian super-vector and using factor analysis to reduce its dimension. During detection, the PLDA (Probabilistic Linear Discriminant Analysis) score between the audio segment and the w-vector of each keyword is calculated to obtain the detection result.

3. A keyword spotting method based on complementary models and score fusion is implemented. The concept of keyword candidate points is proposed; these points are determined mainly from the positions of the maxima of the scoring curves obtained by the two algorithms. The scores at the keyword candidate points are weighted and fused to form the decision for keyword spotting. In an experiment on detecting 10 keywords, the false rejection rate and the false acceptance rate are 0.195 and 0.197, respectively. The fusion method is compared with the method based on audio experience trajectories and the method based on the w-vector to verify its effectiveness. A further comparison with existing algorithms shows that, in low data resource scenarios, the fusion method achieves better results than the method based on the Hidden Markov Model and the method based on neural networks.
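The candidate-point fusion described in item 3 can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything specific here is an assumption made for illustration: the function names, the min-max normalization of each scoring curve, taking the union of both curves' local maxima as candidate points, and the `weight` and `threshold` values.

```python
import numpy as np

def local_maxima(curve):
    """Indices that exceed their left neighbor and are >= their right neighbor."""
    c = np.asarray(curve, dtype=float)
    if c.size < 3:
        return np.array([], dtype=int)
    inner = (c[1:-1] > c[:-2]) & (c[1:-1] >= c[2:])
    return np.flatnonzero(inner) + 1

def min_max(curve):
    """Rescale a scoring curve to [0, 1] so the two models are comparable (assumption)."""
    c = np.asarray(curve, dtype=float)
    span = c.max() - c.min()
    return np.zeros_like(c) if span == 0 else (c - c.min()) / span

def fuse_at_candidates(traj_scores, wvec_scores, weight=0.5, threshold=0.6):
    """Weighted fusion of two scoring curves at keyword candidate points.

    traj_scores: per-frame scores from the trajectory-based model (hypothetical input)
    wvec_scores: per-frame PLDA-style scores from the w-vector model (hypothetical input)
    weight, threshold: illustrative values, not taken from the thesis
    """
    a = min_max(traj_scores)
    b = min_max(wvec_scores)
    # Candidate points: positions where either curve has a local maximum.
    candidates = np.union1d(local_maxima(a), local_maxima(b))
    fused = weight * a[candidates] + (1.0 - weight) * b[candidates]
    keep = fused >= threshold
    return candidates[keep], fused[keep]

# Toy curves: both models peak at frame 2; only the first has a weak peak at frame 5.
traj = [0.1, 0.2, 0.9, 0.3, 0.1, 0.4, 0.2]
wvec = [0.0, 0.1, 0.8, 0.2, 0.1, 0.1, 0.0]
positions, scores = fuse_at_candidates(traj, wvec)
print(positions, scores)
```

In this toy input the weak peak at frame 5 is supported by only one model, so its fused score falls below the threshold and it is rejected. This is the kind of single-model error that the fusion is meant to suppress.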
Keywords/Search Tags: keyword spotting, audio experience trajectories, w-vector, score fusion