
Research On Human Computer Interaction Based On Speech Keyword Spotting

Posted on: 2017-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: M Li
Full Text: PDF
GTID: 2308330482479463
Subject: Signal and Information Processing
Abstract/Summary:
Human-computer interaction is the field that studies interaction between humans and computers; its modalities include keyboard, mouse, speech recognition, gesture input, and sensory feedback. As human-computer interaction technology has developed, speech has come to be regarded as the most convenient and efficient way for people to communicate with computers. Enabling machines to understand human language and act on human intent has long been a goal. Keyword spotting is a special form of speech recognition which, in practical applications, is mainly used to find a small number of specific words in a continuous speech stream. Compared with continuous speech recognition, keyword spotting consumes fewer resources, achieves a high recognition rate, and is highly practical. Therefore, keyword spotting technology is widely used.

At present, there are three kinds of speech keyword spotting systems: systems based on a garbage model, systems based on phonemes/syllables, and systems based on continuous speech recognition. This thesis mainly studies the technology of keyword spotting based on continuous speech recognition. The theoretical research focuses on two parts: the theory of continuous speech recognition and the theory of speech keyword spotting. Some improvements are also made on the basis of these two theories. The main contents of this thesis are as follows.

(1) In the part on continuous speech recognition theory, we mainly introduce front-end speech signal processing, the acoustic model, the linguistic model, and search decoding. Speech signal processing mainly includes endpoint detection, pre-emphasis, framing, and extraction of acoustic feature parameters. The feature parameters extracted from the signal in this thesis are Mel-frequency cepstral coefficients (MFCC). To improve the robustness and discriminative ability of the parameters, the extracted MFCC parameters are transformed by linear discriminant analysis (LDA). The acoustic model part mainly covers the hidden Markov model (HMM), the Gaussian mixture model (GMM), and the subspace Gaussian mixture model (SGMM). In this part, we also use the SGMM-UBM model in place of the traditional HMM-GMM to build acoustic models. The linguistic model part mainly covers language models based on grammar and language models based on statistics. In this thesis, we use the tree language model based on statistics. The search and decoding part mainly introduces the Viterbi algorithm and the output produced after decoding.

(2) In the part on speech keyword spotting, the thesis mainly introduces the lattice structure, the keyword search algorithms, lattice-based posterior probability confidence calculation and its improvement, the keyword output rules, and the system performance evaluation criteria. When calculating the confidence, we introduce a string similarity function based on minimum edit distance, whose main purpose is to penalize spotting errors. The keyword search algorithms covered are mainly the dynamic programming algorithm and the token passing algorithm.

(3) In this thesis, we build a keyword spotting system based on continuous speech recognition. The main tool used in this system is Kaldi, and the dataset is THCHS-30, released by Tsinghua University. Through simulation experiments, we analyze how the improvements to the continuous speech recognition theory and the keyword spotting theory affect the performance of the system.
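The string similarity based on minimum edit distance mentioned in part (2) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not give the exact formula, so normalizing the distance by the longer sequence length and multiplying the posterior confidence by the similarity are assumptions, and the function names edit_distance, similarity, and penalized_confidence are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a to each prefix of b
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]; the normalization by the longer length is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def penalized_confidence(posterior, keyword, hypothesis):
    """Scale a posterior confidence by the similarity (assumed multiplicative penalty)."""
    return posterior * similarity(keyword, hypothesis)
```

Under these assumptions, a hypothesis whose unit sequence (for example, syllables or phonemes) matches the keyword exactly keeps its posterior confidence unchanged, while each insertion, deletion, or substitution lowers it in proportion to the sequence length.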
Keywords/Search Tags:Keyword Spotting, Continuous Speech Recognition, Acoustic Model, Linguistic Model, Keyword Search Algorithm