
Research And Implementation Of Mongolian-Chinese Mixed Language Speech Recognition System Based On Deep Learning

Posted on: 2022-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Su
GTID: 2518306509954619
Subject: Computer technology
Abstract/Summary:
As cultural exchange grows closer, bilingual or mixed expression in multiple languages has become a common linguistic phenomenon. Internationally, mixed use of languages such as Chinese-English, English-German, and English-French is increasingly common. In China, mixed use of languages such as Uygur-Chinese, Tibetan-Chinese, and Mongolian-Chinese is also growing. Multi-language mixed speech recognition has therefore become a hot topic in speech recognition research. Although monolingual speech recognition systems for Chinese, Mongolian, and English already meet practical requirements, and research on Chinese-English mixed speech recognition is relatively mature, research on Mongolian-Chinese mixed-language speech recognition is still in its infancy. This thesis therefore draws on the characteristics of Mongolian and Chinese to establish a Mongolian-Chinese mixed-language corpus, studies modeling unit selection, the Mongolian-Chinese mixed-language pronunciation dictionary, and the construction of the acoustic model and language model, and builds a Mongolian-Chinese mixed-language speech recognition system.

First, this thesis constructs the Mongolian-Chinese mixed-language corpus and, on this basis, studies the selection of modeling units and establishes a baseline Mongolian-Chinese mixed-language speech recognition system based on the acoustic model. In addition, based on the advantages of network and network in modeling time-dependent information, the network is introduced for acoustic modeling of Mongolian-Chinese mixed-language speech, which further reduces the word error rate. Compared with the baseline acoustic model, acoustic model, and acoustic model, the word error rate of the hybrid-based acoustic model decreased by 11.3%, 5.0%, and 5.1%, respectively.

Second, a hybrid end-to-end speech recognition method is adopted to carry out the end-to-end Mongolian-Chinese mixed-language speech recognition task. During training, the model is trained with a multi-objective task learning method. During decoding search, the target sequence is predicted in combination with the decoder, and a Mongolian-Chinese language model based on the network is used to further improve recognition. The experimental results show that the end-to-end Mongolian-Chinese mixed-language speech recognition system has lower performance. Analysis attributes this mainly to the sparsity of the Mongolian-Chinese mixed-language data, which leads to underfitting of the model.

Finally, this thesis builds a Mongolian-Chinese mixed-language speech recognition application system. It mainly comprises a client module, a system service module, an online decoding module, and a system performance test module.
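The word error rate used to compare the acoustic models above is conventionally computed as the token-level edit distance between the reference transcript and the recognizer output, divided by the number of reference tokens. The following is a minimal sketch of that standard definition, not the thesis's own scoring code. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the abstract does not state how mixed Mongolian-Chinese text was tokenized for scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Token-level Levenshtein distance divided by the reference length.

    Assumption: tokens are separated by whitespace. The abstract does not
    say how Mongolian words and Chinese characters were segmented.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference tokens
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1       # reference token missing in output
            insertion = curr[j - 1] + 1  # extra token in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference tokens.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, the value can exceed 1.0 when the output is much longer than the reference.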
Keywords/Search Tags: Mongolian-Chinese mixed language, Speech recognition, Acoustic model, Language model, Neural network, End-to-end