
Research On Customizable Speech Command Word Recognition

Posted on: 2022-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: S X Li
GTID: 2518306485459474
Subject: Computer technology
Abstract/Summary:
Customizable speech command word recognition aims to let users define their own spoken command words for controlling a machine, so that command word recognition can be personalized. Current speech command word recognition systems are mainly built in five ways: template matching based on DTW, methods based on a complement (garbage) model, methods based on a large-vocabulary continuous speech recognition system, neural-network-based methods, and end-to-end methods. In this thesis, a user-defined command word module is added to the basic speech command word recognition technology, which yields better recognition results for customized speech command words.

First, a customizable speech command word recognition system is built on a large-vocabulary continuous speech recognition system. Using the trained baseline recognizer, the word lattices are represented as weighted finite-state transducers (WFSTs), and a WFST carrying position and confidence information is generated to serve as an inverted index. At search time, composing this inverted index with the WFST corresponding to a command word gives the position and confidence of that command word. The error rate of customized speech command word recognition with this method is 28.4%.

Finally, a customizable speech command word recognition system is implemented with a dynamic language model and a whitening model. Template sentences are defined according to sentence structure and converted into WFSTs. The customizable command words are inserted at the slots to be filled in the template WFST, forming the dynamic language model of the main path. This is combined with a secondary-path language model trained on a similar large data set to form the complete language model for the system's decoding network. The final experimental results for this method are as follows: without rejection, the character error rate is 0.01% and the sentence error rate is 0.07%; with the SPN symbol used for rejection, the character error rate and sentence error rate are 0.09% and 0.35%, respectively; with SIL used for rejection, they are 0.08% and 0.35%, respectively.
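
The inverted-index search mentioned in the abstract can be illustrated with a much simplified sketch. This is not the thesis's implementation: instead of WFST composition it uses a plain Python dictionary, and the arc tuples, the `Hit` type, the `max_gap` parameter, and the rule of multiplying word confidences are all assumptions made for illustration only. It shows the general idea of indexing lattice words by position and confidence, then looking up a user-defined command word.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    utt_id: str
    start: float  # start time in seconds
    end: float    # end time in seconds
    conf: float   # confidence in [0, 1]


def build_index(lattice_arcs):
    """Map each word to every place it occurs, with time span and confidence."""
    index = defaultdict(list)
    for utt_id, word, start, end, conf in lattice_arcs:
        index[word].append(Hit(utt_id, start, end, conf))
    return index


def search(index, command, max_gap=0.05):
    """Find occurrences of a (possibly multi-word) command.

    Consecutive words must lie in the same utterance and be adjacent in
    time (gap of at most max_gap seconds). The confidence of a match is
    the product of its word confidences (an assumed scoring rule).
    """
    words = command.split()
    if not words:
        return []
    matches = list(index.get(words[0], []))
    for word in words[1:]:
        extended = []
        for m in matches:
            for h in index.get(word, []):
                if h.utt_id == m.utt_id and abs(h.start - m.end) <= max_gap:
                    extended.append(Hit(m.utt_id, m.start, h.end, m.conf * h.conf))
        matches = extended
    # Best-scoring match first
    return sorted(matches, key=lambda h: h.conf, reverse=True)


# Hypothetical lattice arcs: (utterance id, word, start, end, confidence)
ARCS = [
    ("utt1", "turn", 0.10, 0.40, 0.9),
    ("utt1", "on", 0.40, 0.55, 0.8),
    ("utt1", "in", 0.40, 0.55, 0.1),
    ("utt1", "light", 0.55, 0.95, 0.7),
    ("utt2", "light", 0.20, 0.60, 0.6),
]

if __name__ == "__main__":
    idx = build_index(ARCS)
    for hit in search(idx, "turn on light"):
        print(hit)
```

In a WFST-based system the lookup step would be a composition of the index transducer with the command word's transducer, which also handles competing lattice paths and score normalization that this sketch ignores.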
Keywords/Search Tags:Template Matching, Complement Model, Neural Network, Customizable Command Word Recognition, Dynamic Language Model