With advances in technology, the transcription accuracy of Automatic Speech Recognition (ASR) has improved greatly. However, factors such as speaker accent, environmental noise, and entity nouns still cause transcription errors. ASR transcripts are now not only read by humans but also used as input to many downstream tasks, so low transcription accuracy severely degrades downstream performance. Improving ASR transcription quality is therefore crucial to improving those tasks. In this thesis, we first correct the transcribed text with a language model augmented by template corpus expansion; we then correct the remaining errors with a transcript error correction method that uses an auxiliary classification task based on error type, further improving transcription quality. The main contributions are as follows:

1) To address the lack of language model training corpus in transcription scenarios with many entity nouns, a template corpus expansion method based on the CBert model is proposed. First, a template text dataset and an out-of-domain text dataset are constructed by rules, and different labels are applied to them to obtain a template text label dataset and an out-of-domain text label dataset. Second, the CBert model is pre-trained and fine-tuned using the template text dataset, the out-of-domain text dataset, and the template text label dataset. Then, the CBert model is used to generate vocabulary confusion sets from the out-of-domain text dataset. Finally, the expanded corpus is generated from the vocabulary confusion sets through the Beam Search algorithm and the FAQ system. Experimental results show that the language model with template corpus expansion effectively improves transcription quality: on the AISHELL_NER dataset, the CERR (character error rate reduction) increases by 2.7% relative to the baseline, and the CER (character error rate) decreases by 0.15% relative to the baseline.

2) To further improve transcription quality, a transcript error correction method with an auxiliary classification task based on error type is proposed. First, a transcript error correction model is built with Transformer. Second, the error-type auxiliary classification task is added to model training, the losses of the main task and the auxiliary classification task are combined through interpolation, and at inference time the prediction of the auxiliary classification task decides whether a sentence is corrected. Experimental results show that, when applied to the AISHELL_NER transcripts already corrected by the language model with template corpus expansion, the method increases the CERR by 11.61% and decreases the CER by 0.61% compared with the uncorrected text, and reduces the model's error correction time by 23.6 seconds.

Finally, a visual error correction interface is built with PyQt5 to show the error correction results intuitively.
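The second contribution combines two losses by interpolation and uses the auxiliary classifier to decide whether to correct a sentence. The following is a minimal sketch of that idea in plain Python, not the thesis implementation. The interpolation weight, the `NO_ERROR` label name, and the functions `interpolated_loss` and `correct_if_needed` are hypothetical names introduced here for illustration; the abstract does not state the actual weight or the set of error types.

```python
from typing import Callable

# Hypothetical label the auxiliary classifier emits for an error-free sentence.
NO_ERROR = "no_error"


def interpolated_loss(main_loss: float, aux_loss: float, weight: float = 0.8) -> float:
    """Linearly interpolate the correction loss and the classification loss.

    `weight` is an assumed hyperparameter in [0, 1]; the abstract does not
    give the value actually used.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * main_loss + (1.0 - weight) * aux_loss


def correct_if_needed(
    sentence: str,
    predicted_error_type: str,
    corrector: Callable[[str], str],
) -> str:
    """Run the corrector only when the classifier predicts some error type."""
    if predicted_error_type == NO_ERROR:
        # The sentence is judged clean, so the correction model is skipped.
        return sentence
    return corrector(sentence)


if __name__ == "__main__":
    # Toy stand-in for the Transformer correction model (assumption).
    toy_corrector = str.upper
    print(interpolated_loss(2.0, 1.0, weight=0.5))
    print(correct_if_needed("abc", NO_ERROR, toy_corrector))
    print(correct_if_needed("abc", "substitution", toy_corrector))
```

Skipping sentences that the classifier judges error-free avoids running the correction model on them, which is consistent with the reduced correction time reported above.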