Research On Online Tibetan Speech Recognition System

Posted on: 2022-06-01 | Degree: Master | Type: Thesis
Country: China | Candidate: X D Yang | Full Text: PDF
GTID: 2518306500456434 | Subject: Master of Engineering
Abstract/Summary:
Online speech recognition is an important research area in speech processing and its applications. In recent years, with the continuous development of network and communication technologies, more and more technologies have been applied online, and speech recognition has drawn growing attention as an important human-computer interaction technology. Research on and applications for mainstream languages such as English, Mandarin, and Japanese are mature, and large speech corpora for these languages are easily accessible. Tibetan speech recognition, however, still has open problems, and because Tibetan is a low-resource language, recording a speech corpus for it is difficult. Given that few researchers study Tibetan speech recognition, that the research foundation is relatively weak, and that practical applications are few, this thesis carries out work in the following three aspects.

Firstly, end-to-end speech recognition modeling methods are studied, and the performance of the CNN-CTC, LSTM-CTC, and Transformer models on Tibetan speech recognition is compared experimentally. Under the same experimental environment, the three models reach word error rates of 32.6%, 30.6%, and 29.3%, respectively, which shows that the Transformer model performs best on the Tibetan speech recognition task. To address the low recognition rate of end-to-end speech recognition on a small corpus, this thesis introduces the SpecAugment augmentation algorithm to augment the original speech data. After augmentation, the word error rates of the three models fall to 28.1%, 26.1%, and 25.3%, respectively.

Secondly, this thesis completes the framework design of the online Tibetan speech recognition system by analyzing the system's requirements and combining speech recognition technology with web development technology. A Web-based online Tibetan speech recognition system is built on the browser/server (B/S) architecture, realizing online Tibetan speech recognition in the browser, and the implementation method and design scheme of the system are summarized and analyzed.

Finally, to verify the reliability of the online Tibetan speech recognition system, this thesis builds a test environment, conducts functional tests on each module of the system, and verifies that the system functions normally by analyzing and comparing the results of running it.
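The word error rates quoted in the abstract follow the usual definition: the minimum number of substitutions, deletions, and insertions needed to turn the recognized text into the reference, divided by the reference length. The sketch below is an illustration only and is not taken from the thesis. The function name `word_error_rate` is hypothetical, and splitting on whitespace is an assumption, since the abstract does not state which scoring unit (word, syllable, or character) was used for Tibetan.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference tokens.

    Assumption: tokens are separated by whitespace. The thesis abstract does
    not say how Tibetan text was segmented for scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra hypothesis token
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) against four reference tokens.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.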
Keywords/Search Tags: Deep neural networks, Dynamic website design, Tibetan speech recognition, Online speech recognition, Data augmentation (SpecAugment)