With the widespread application of speech recognition technology, the accuracy of recognized text has become a concern. Because model size is limited on the speech recognition side, it is very difficult to correct errors there using contextual semantic information. It is therefore of great significance to study how to correct recognition errors using the recognized Chinese text itself. The main work of this paper is as follows:

(1) This paper proposes a phonetics-based MacBERT error correction model for the mispronunciation-type text errors that arise after speech recognition. The model exploits the fact that an erroneous character and the original character are mostly homophones or near-homophones, and introduces a character phonetic similarity measure to select correction candidates. Using the multi-level candidates for erroneous characters in the validation set, a multi-level graph of confidence against phonetic similarity is drawn, and a confidence and phonetic similarity threshold curve is delineated for each level of candidates, ensuring that the corrected character and the erroneous character are similar in pronunciation. The paper also introduces a new method for calculating phonetic similarity: first, a Pinyin code is constructed from the initial, medial, final, and tone of each syllable; then, the similarity between codes is calculated on the basis of a collection of 39,462 easily confused pronunciations of 4,992 Chinese characters, which yields the phonetic similarity of two characters. Experiments on the THCHS-30 dataset show that, compared with the existing method, the model improves accuracy by 20.40% and recall by 6.31%, which supports the validity of introducing phonetic similarity into error correction.

(2) To address the scarcity of data and the difficulty of manually labeling texts with polyphonic (multi-sound) and swallowed-sound errors, this paper proposes a BART-based three-segment error correction model for these errors. The model uses two text augmentation methods to improve its robustness and generalization ability: introducing noise with the BART noise generator, and introducing noise based on previous results. After the noise is introduced, the model uses BART to generate the corrected text. Comparisons with existing methods on the THCHS-30 and NLPCC2018 datasets demonstrate the effectiveness of the model.

(3) A Chinese text error correction system for text produced by speech recognition is constructed. The MacBERT-based error detection model for the error detection stage, the phonetics-based MacBERT error correction model for the error correction stage, and the BART-based three-segment polyphonic and swallowed-sound error correction model are integrated into the system, providing users with an automated post-transcription text error correction service that combines speech recognition, error detection, and error correction.
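
To make the Pinyin-code idea in contribution (1) concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes syllables are given in numbered-tone Pinyin (for example `zhang1`), splits each into initial, medial, final, and tone with a deliberately simplified rule, and scores two codes with a weighted part-by-part comparison. The names `encode`, `similarity`, `CONFUSABLE`, and `WEIGHTS` are hypothetical, as are the weights and the tiny confusable-pair table, which merely stands in for the 39,462-entry collection described above.

```python
from typing import NamedTuple

# Two-letter initials come first so "zh" is matched before "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")

# Hypothetical stand-in for a table of easily confused pronunciations.
CONFUSABLE = {
    frozenset(("z", "zh")), frozenset(("c", "ch")), frozenset(("s", "sh")),
    frozenset(("n", "l")), frozenset(("an", "ang")),
    frozenset(("en", "eng")), frozenset(("in", "ing")),
}

# Hypothetical integer weights for each part of the code (they sum to 10).
WEIGHTS = {"initial": 4, "medial": 1, "final": 4, "tone": 1}


class PinyinCode(NamedTuple):
    initial: str
    medial: str
    final: str
    tone: str


def encode(syllable: str) -> PinyinCode:
    """Split a numbered-tone syllable such as 'zhuang1' into a Pinyin code."""
    tone = syllable[-1] if syllable[-1].isdigit() else "5"  # 5 = neutral tone
    body = syllable.rstrip("0123456789")
    initial = next((i for i in INITIALS if body.startswith(i)), "")
    rest = body[len(initial):]
    # Simplified rule: a leading i/u/v (v for ü) followed by a vowel is a medial.
    if len(rest) > 1 and rest[0] in "iuv" and rest[1] in "aeiouv":
        return PinyinCode(initial, rest[0], rest[1:], tone)
    return PinyinCode(initial, "", rest, tone)


def part_similarity(a: str, b: str) -> float:
    """1.0 for identical parts, 0.5 for a listed confusable pair, else 0.0."""
    if a == b:
        return 1.0
    return 0.5 if frozenset((a, b)) in CONFUSABLE else 0.0


def similarity(syllable_a: str, syllable_b: str) -> float:
    """Weighted similarity in [0, 1] between two numbered-tone syllables."""
    code_a, code_b = encode(syllable_a), encode(syllable_b)
    score = sum(
        WEIGHTS[name] * part_similarity(getattr(code_a, name), getattr(code_b, name))
        for name in PinyinCode._fields
    )
    return score / sum(WEIGHTS.values())


if __name__ == "__main__":
    print(encode("zhuang1"))
    print(similarity("zhang1", "zang1"))  # confusable initials, same final and tone
    print(similarity("ma1", "ma3"))       # differs only in tone
```

Under these assumed weights, a pair that differs only by a confusable initial scores lower than an exact homophone but well above an unrelated pair, which is the property a similarity threshold on correction candidates relies on. In practice the weights and the confusable table would have to be derived from data rather than fixed by hand.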