Research On Language Model Corpus Expansion And Text Error Correction Algorithm For Speech Transcription

Posted on:2024-03-10

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhang

GTID:2568307103973689

Subject:Electrical engineering

Abstract/Summary:

With the development of technology, the transcription accuracy of Automatic Speech Recognition (ASR) has improved greatly. However, factors such as speaker accent, environmental noise, and entity nouns still cause inaccurate transcriptions. ASR transcripts are now not only read by humans but also used as input to many downstream tasks, and low transcription accuracy significantly degrades the performance of those tasks. Improving ASR transcription quality is therefore crucial to downstream performance. In this thesis, we first correct the transcribed text using a language model whose training corpus has been expanded with templates; we then correct the remaining errors with a transcribed-text error correction method that uses an auxiliary classification task based on error type, further improving transcription quality. The main contributions are as follows:

1) To address the lack of training corpus for the language model in transcription scenarios with many entity nouns, a template corpus expansion method based on the CBert model is proposed. First, a template text dataset and an out-of-domain text dataset are constructed based on rules, and different labels are applied to them to obtain a template text label dataset and an out-of-domain text label dataset. Second, the CBert model is pre-trained and fine-tuned using the template text dataset, the out-of-domain text dataset, and the template text label dataset. Then, the CBert model is used to generate vocabulary confusion sets from the out-of-domain text dataset. Finally, the expanded corpus is generated from the vocabulary confusion sets through the Beam Search algorithm and the FAQ system. Experimental results show that the language model with template corpus expansion effectively improves transcription quality: on the AISHELL_NER dataset, the CERR (character error rate reduction) increases by 2.7% relative to the baseline, and the CER (character error rate) decreases by 0.15% relative to the baseline.

2) To further improve transcription quality, a transcribed-text error correction method with an auxiliary classification task based on error type is proposed. First, a transcribed-text error correction model is built with Transformer. Second, the error-type-based auxiliary classification task is added to model training, and the losses of the main task and the auxiliary classification task are combined through interpolation; during model inference, the prediction of the auxiliary classification task determines whether a sentence is corrected. Experimental results show that, after correcting the AISHELL_NER transcripts already corrected by the template-corpus-expanded language model, the CERR increases by 11.61% and the CER decreases by 0.61% compared with the uncorrected text, and the model's error correction time is reduced by 23.6 seconds. Finally, a visual error correction interface is built with PyQt5 to show the correction effect intuitively.
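The abstract reports its results as CER and CERR and describes an interpolated loss plus an inference-time decision driven by the auxiliary classifier, but it gives no formulas. The following Python sketch illustrates one plausible reading. Everything in it is an assumption, not the thesis's implementation: CER is taken as character-level edit distance divided by reference length, CERR as the relative reduction in CER, the loss as a convex combination with a hypothetical weight `lam`, and the gate as skipping correction when the predicted error type equals a hypothetical "none" label. All function names are illustrative.

```python
from typing import Callable, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two sequences (here: characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate, assumed to be edit distance / reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def cerr(cer_before: float, cer_after: float) -> float:
    """Relative CER reduction (assumed definition): (before - after) / before."""
    if cer_before <= 0:
        raise ValueError("cer_before must be positive")
    return (cer_before - cer_after) / cer_before


def interpolated_loss(main_loss: float, aux_loss: float, lam: float) -> float:
    """Assumed interpolation: lam * main + (1 - lam) * auxiliary."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * main_loss + (1.0 - lam) * aux_loss


def maybe_correct(sentence: str,
                  predicted_error_type: str,
                  correct_fn: Callable[[str], str],
                  no_error_label: str = "none") -> str:
    """Run the correction model only when the classifier predicts an error."""
    if predicted_error_type == no_error_label:
        return sentence  # skip correction; this is where inference time is saved
    return correct_fn(sentence)


print(cer("abcd", "abxd"))  # one substitution over four characters
```

Under these assumptions, skipping sentences that the classifier labels as error-free avoids running the correction model on them, which is one way the reported reduction in correction time could arise.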

Keywords/Search Tags:

Speech Recognition, Language Model, Transcribed Text Error Correction, Template Corpus Expansion, Auxiliary Classification Task

Related items

1	Research And Application Of Text Error Detection And Correction After Speech Recognition
2	Studies On Speech Recognition Error Detection And Correction Based On Example-Context
3	Research On Language Model Rescoring And Error Correction Of Transcription Results In Chinese Speech Transcription
4	Design And Implementation Of Chinese Text Error Correction System After Speech Recognition
5	Chinese Research On The Method Of Text Error Detection And Error Correction Under Speech Transcription
6	Research On Automatic Summarization Algorithm For Meeting Speech Transcribed Text
7	Research On Training Data Expansion Method For End-to-end Speech Recognition Model Based On Text Data
8	Researching Of The Mongolian Language Model Based On Speech Recognition
9	Application Research On Statistical Language Model Of Large Vocabulary Continuous Speech Recognition System
10	Research On Statistical Language Model Of Large-Vocabulary Continuous Speech Recognition System