
Research On Language Model Rescoring And Error Correction Of Transcription Results In Chinese Speech Transcription

Posted on: 2022-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: H C Qu
Full Text: PDF
GTID: 2518306542963569
Subject: Computer Science and Technology
Abstract/Summary:
With the development of automatic speech recognition (ASR) technology, researchers have gradually begun to study neural network language models (NNLMs) for ASR systems in addition to acoustic models. At present, because of decoding speed constraints, ASR systems still mainly use N-gram language models. However, an N-gram language model is limited by the size of N in how much preceding context it can use, whereas an NNLM can exploit longer context dependencies. An NNLM can therefore be used to rescore N-best lists and further improve the recognition accuracy of an ASR system. On the other hand, the transcribed text selected after rescoring in Chinese ASR may still contain errors, such as extra characters and missing characters. Correcting these errors can further improve recognition accuracy and make subsequent processing of the transcribed text easier. This thesis studies two problems: rescoring N-best lists extracted from the lattice generated in the first decoding pass of an ASR system, and correcting errors in the system's transcription results.

1. Research on N-best list rescoring based on the Transformer-XL language model. For the specific domain of public party affairs, this thesis first constructs a corpus and a vocabulary for the domain. A Transformer-XL language model with word boundary information is then trained on this domain corpus and vocabulary. The model uses a Transformer encoder and has a strong capability for contextual feature extraction. At the same time, thanks to its segment-level recurrence mechanism, the model can handle longer dependencies. The experiments in this thesis show that rescoring N-best lists with the model after adding the boundary information further improves the recognition accuracy of the ASR system.

2. Research on error correction of transcribed text based on the BERT language model. In Chinese ASR systems, pronunciation and other problems generally lead to wrong words and missing words in the transcription results, as well as meaningless modal words. To address these problems, this thesis uses sequence labeling to detect errors and then uses the language model to correct them, making the transcription results more accurate and fluent. In particular, the dataset is constructed according to the characteristics of the task, for example the homophone errors that often appear in speech transcription.
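
As a rough illustration of the N-best rescoring idea described in the abstract, the following minimal Python sketch re-ranks candidate transcriptions by adding a weighted language model log-probability to each first-pass score. The abstract does not state the scoring formula used in the thesis, so the linear combination, the `lm_weight` parameter, the `Hypothesis` type, and the lookup-table `toy_lm_logprob` (a stand-in for a trained neural language model such as Transformer-XL) are all assumptions made for this example, and the scores are invented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str          # candidate transcription from the N-best list
    first_pass: float  # first-pass log score (acoustic + N-gram LM), assumed given


def rescore(nbest: List[Hypothesis],
            lm_logprob: Callable[[str], float],
            lm_weight: float = 0.5) -> List[Hypothesis]:
    """Return the hypotheses sorted by first-pass score plus weighted LM log-probability."""
    return sorted(
        nbest,
        key=lambda h: h.first_pass + lm_weight * lm_logprob(h.text),
        reverse=True,
    )


# Hypothetical stand-in for a neural LM: a fixed table of sentence log-probabilities.
_TOY_SCORES = {
    "今天天气很好": -4.0,  # fluent sentence
    "今天天汽很好": -9.0,  # contains a homophone error (汽 instead of 气)
}


def toy_lm_logprob(text: str) -> float:
    # Unknown sentences get a low default score (an arbitrary choice for the sketch).
    return _TOY_SCORES.get(text, -20.0)


# Invented first-pass scores: the erroneous hypothesis is ranked first before rescoring.
nbest = [
    Hypothesis("今天天汽很好", first_pass=-10.0),
    Hypothesis("今天天气很好", first_pass=-10.5),
]

best = rescore(nbest, toy_lm_logprob)[0]
print(best.text)  # prints 今天天气很好
```

In practice the weight would be tuned on held-out data, and the language model score would come from a trained model rather than a lookup table.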
Keywords/Search Tags: Transformer-XL language model, N-best list rescoring, BERT language model, Text error correction