Particle Optimization of CNN-RNN Based ASR System for Uyghur-Kazakh-Kirghiz Languages

Posted on: 2021-08-20  Degree: Master  Type: Thesis
Country: China  Candidate: K D M H Y M J Mu  Full Text: PDF
GTID: 2518306128979149  Subject: Engineering, information and communication engineering
Abstract/Summary:
The development of neural networks and their excellent performance in natural language processing tasks have brought new opportunities to multilingual information processing, especially for the Uyghur, Kazakh, and Kirghiz languages. The high-dimensional and long-context modeling capabilities of neural networks bring new vitality to the processing of Uyghur-Kazakh-Kirghiz and multilingual multimedia information. This thesis studies and implements a complete speech recognition system based on CNN-RNN and other neural network frameworks, using morphologically derivative minority languages as examples. Based on the characteristics of agglutinative languages, it explores the optimization of acoustic and language models. In addition, a multilingual speech and text pre-processing software system was built for the Uyghur, Kazakh, and Kirghiz languages, which have similar lexical structures.

First, based on the derivative lexical features of these languages, an integrated information processing software environment with a unified user interface was developed. Pre-processing text and speech in derivative languages greatly facilitates natural language processing for Uyghur, Kazakh, and Kirghiz. The framework analyzes acoustic and morphological aspects of the three languages at multiple levels of granularity: phonemes, morphemes, words, and sentences. Complicated tasks such as normalization and unit segmentation are integrated into a single project that completes a series of pre-processing steps. The system trains an independent statistical model on a small, manually prepared corpus of "word-morpheme" parallel sequences, and morpheme segmentation accuracy reaches 96%, 92%, and 88% for Uyghur, Kazakh, and Kirghiz, respectively. The system is extensible in both languages and functions, and independent statistical models can be embedded in it.

The efficiency and accuracy of the neural-network-based speech recognition system for minority languages are greatly improved compared with traditional methods. For large-vocabulary speech recognition systems, it is important to choose the acoustic model and language model appropriately. We conducted in-depth research on a relatively small Uyghur corpus (the THUYG open corpus), using the Kaldi open-source speech recognition platform to implement a deep CNN-HMM acoustic model and, through theoretical analysis and comparative experiments, combining it with N-gram and RNN language models respectively. The mainstream representatives of traditional continuous speech recognition technology are the GMM-HMM acoustic model and the N-gram language model, but their recognition accuracy is not high. Therefore, this thesis replaces the N-gram language model with an RNN language model and combines it with CNN-HMM acoustic models of different depths to improve recognition accuracy for Uyghur-Kazakh-Kirghiz speech. The experimental results show that the system based on the RNN language model achieves better recognition results and reduces the morpheme error rate of Uyghur speech recognition to 15.06%.
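The abstract reports a morpheme error rate but does not say how it is computed. As a minimal sketch, assuming the usual edit-distance definition used for word error rate but applied to morpheme tokens (substitutions + deletions + insertions, divided by the number of reference morphemes), the metric could look like the following. The function names and the placeholder tokens are hypothetical and are not taken from the thesis or from Kaldi.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def morpheme_error_rate(refs: List[Sequence[str]],
                        hyps: List[Sequence[str]]) -> float:
    """Total edit errors divided by total reference morphemes (assumed definition)."""
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference contains no morphemes")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


# Toy example with placeholder morpheme tokens (not real data).
reference = [["stem1", "+sufA", "stem2", "+sufB"]]
hypothesis = [["stem1", "+sufX", "stem2"]]
rate = morpheme_error_rate(reference, hypothesis)
print(f"morpheme error rate: {rate:.2%}")
```

In the toy example, one substitution and one deletion against four reference morphemes give a rate of 50%. A real evaluation would first segment both the reference transcripts and the recognizer output into morphemes with the same segmenter.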
Keywords/Search Tags:Uyghur-Kazakh-Kirghiz languages, Morpheme segmentation, Stem extraction, Speech recognition, CNN-HMM, RNN