Research On Amdo Tibetan Speech Recognition Technology Based On MRDCNN-CTC-Transformer

Posted on:2022-11-28

Degree:Master

Type:Thesis

Country:China

Candidate:B J Gong

Full Text:PDF

GTID:2518306752493284

Subject:Computer software and theory

Abstract/Summary:

Speech recognition technology takes speech as its research object. Through speech signal processing and pattern recognition, computers can automatically recognize and understand human speech and convert it into the corresponding text sequences. It has important application value in areas such as smart homes, unmanned driving, and voice command interaction for mobile robots. With the development of deep learning, speech recognition based on neural networks has gradually become a research hotspot at home and abroad. Owing to limitations in corpora and natural language processing technology, and to the existence of multiple dialects, Tibetan speech recognition has developed slowly, yet the need for it is urgent. To promote the development of Tibetan speech recognition, this thesis takes Amdo Tibetan speech as its research object and adopts deep learning methods to study the key technologies of Amdo Tibetan speech recognition: construction of a Tibetan text corpus and an Amdo speech corpus, corpus preprocessing, and construction of the acoustic model and language model.

(1) Corpus Construction. By analyzing the features of Tibetan text and of Amdo Tibetan speech, this thesis collects a 284.2 MB text corpus covering different types of text and a 170-hour Amdo Tibetan speech corpus.

(2) Corpus Preprocessing. According to the practical needs of Amdo Tibetan speech recognition, this thesis preprocesses the corpus through normalization, character segmentation/labeling, and related steps. We formulate rules for Tibetan digital text classification and specification, design a speech-recognition-oriented Tibetan character segmentation/labeling algorithm, and compile statistics on the distribution of Tibetan characters. The average accuracy rates of Tibetan digital text classification and specification are 99.45% and 99.28%, respectively, and the accuracy of Tibetan script segmentation/labeling is 99.99%.

(3) Acoustic Modeling and Language Modeling. On the basis of an analysis of the speech features of Amdo Tibetan, we design an MRDCNN-CTC-Transformer speech recognition model for Amdo Tibetan with the Tibetan script as the modeling unit. MRDCNN-CTC serves as the classification algorithm of the acoustic model, and a Transformer serves as the language model.

(4) Design and Implementation of the Amdo Tibetan Speech Recognition System. Building on the acoustic model and language model, we design and implement a visual Amdo Tibetan speech recognition system based on MRDCNN-CTC-Transformer, and verify the performance of the acoustic model, the language model, and the recognition system experimentally. Experiments show that the error rate of the acoustic model is 18.67%, the error rate of the language model is 2.8%, and the error rate of speech recognition is 18.87%.
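
The abstract reports error rates for a CTC-based acoustic model but does not say how they are computed. The sketch below shows one common convention, offered as an assumption and not as the thesis's own procedure: collapse the frame-level CTC output (merge repeated labels, drop blanks), then divide the edit distance between the reference and the hypothesis by the reference length, counted in modeling units. The function names `ctc_collapse`, `edit_distance`, and `error_rate`, and the choice of `0` as the blank index, are hypothetical.

```python
from itertools import groupby
from typing import Hashable, List, Sequence


def ctc_collapse(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Standard CTC post-processing: merge repeated labels, then drop blanks.

    The blank index (0) is an assumption made for this sketch.
    """
    return [label for label, _ in groupby(frame_labels) if label != blank]


def edit_distance(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> int:
    """Levenshtein distance over modeling units (rolling one-row table)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> float:
    """Edit distance divided by the number of reference units."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    frames = [0, 1, 1, 0, 1, 2, 2, 0]   # toy frame-level label ids
    hyp = ctc_collapse(frames)           # repeated 1s merge; blanks separate the two 1s
    ref = [1, 1, 2, 3]                   # toy reference unit ids
    print(hyp, error_rate(ref, hyp))
```

Under this convention, with the Tibetan script as the modeling unit, `ref` and `hyp` would be sequences of Tibetan script units rather than the toy integer ids used here.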

Keywords/Search Tags:

Amdo Tibetan, Speech Recognition, Modeling Unit, Acoustic Model, Language Model

Related items

1	Research On Amdo Tibetan Speech Recognition Technology Based On Deep Learning
2	Amdo Tibetan Speech Recognition Based On Deep Neural Network
3	Research And System Realization Of Tibetan Continuous Speech Recognition Technology
4	A Study On The Extraction Of Speech Depth In Tibetan Language And Its Speech Recognition
5	Research On Tibetan Language Model For Continuous Speech Recognition
6	Research On Speech Synthesis Technology Of Amdo Tibetan Based On Seq2Seq-WaveNet
7	The Research On Segmentation Acoustic Model Based On MPE Tibetan Lhasa Dialect
8	The Recognition Model Research Based On Whole Acoustic Structure Features Of Speech Unit
9	Study And Improve On The Mongolian Speech Recognition System
10	Acoustic Modeling For Continuous Speech Recognition