
Chinese Spelling Error Correction Algorithm Incorporating Multimodal Semantic Features And Applications

Posted on: 2024-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Tian
GTID: 2568307127955079
Subject: Computer technology
Abstract/Summary:
Chinese spelling correction (CSC) is an important research direction in natural language processing. Its goal is to automatically detect and correct spelling errors in text, improving the text's readability and accuracy. Over the past decades, CSC algorithms have been studied extensively using a wide range of techniques. Chinese spelling errors fall into two main categories: phonetic errors, caused by misusing characters that sound alike, and visual errors, caused by misusing characters that look alike. In addition, although pre-trained language models (PLMs) have advanced the CSC task, there is a gap between the knowledge these models acquire during pre-training and the objective of the CSC task. To address these problems, this thesis makes the following contributions.

First, a CSC method is proposed that fuses multimodal feature encoding with textual adversarial training. Combining the phonetic and glyph (morphological) information of characters with semantic information addresses the model's poor performance when it faces multiple error types. Adding textual adversarial training on top of the fused multimodal encoding further improves the model's effectiveness to some extent. To verify the method's feasibility, extensive experiments are conducted on a public dataset with accuracy, recall, and F1-score as evaluation metrics; the results confirm the method's feasibility and superiority.

Second, a CSC model based on Contrastive Probability Optimization (CPO) is proposed. Although pre-trained language models have advanced the CSC task, a gap remains between the semantic knowledge they learn and the actual error correction task. CPO is therefore added to the fused multimodal model to help it learn from its own errors and to narrow the gap between what the pre-trained model learns during training and what the error correction task requires. To verify the method's feasibility, extensive experiments are conducted on a publicly available dataset with accuracy, recall, and F1-score as evaluation metrics; the results confirm the method's feasibility and superiority.

Third, a Chinese spelling error correction system is designed on top of the multimodal CSC model, putting the proposed methods into practice. The system contains three main modules: Web interaction, Chinese text spelling error correction, and Chinese document error correction. System testing shows that the system is functionally sound and has good compatibility.

In summary, this thesis proposes two improved methods that address several problems in Chinese spelling error correction, and extensive experiments on the public SIGHAN dataset verify their effectiveness.
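The recall and F1-score named in the abstract can be illustrated with a minimal sketch. It assumes character-level scoring over equal-length sentences (CSC is usually treated as character substitution); the source does not say whether the thesis scores at character or sentence level, and the function name `correction_prf` and the sample sentences are hypothetical. F1 is the harmonic mean of precision and recall, so the sketch reports precision; whether the abstract's "accuracy" denotes precision is not stated in the source.

```python
from typing import Sequence, Tuple


def correction_prf(
    sources: Sequence[str],
    predictions: Sequence[str],
    targets: Sequence[str],
) -> Tuple[float, float, float]:
    """Character-level correction precision, recall, and F1 (illustrative only)."""
    edited = 0   # positions the model changed
    gold = 0     # positions that are truly misspelled
    correct = 0  # positions the model changed to the right character
    for src, pred, tgt in zip(sources, predictions, targets):
        if not (len(src) == len(pred) == len(tgt)):
            raise ValueError("source, prediction, and target must have equal length")
        for s, p, t in zip(src, pred, tgt):
            if p != s:
                edited += 1
            if t != s:
                gold += 1
            if p != s and p == t:
                correct += 1
    precision = correct / edited if edited else 0.0
    recall = correct / gold if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: one visual error (平 for 苹) and one phonetic error (汽 for 气).
sources = ["我喜欢吃平果", "今天天汽很好"]
targets = ["我喜欢吃苹果", "今天天气很好"]
# The model fixes the first error, misses the second, and makes one wrong edit.
predictions = ["我喜欢吃苹果", "今天天汽很号"]
print(correction_prf(sources, predictions, targets))
```

Here the model makes two edits, one of which is right, and the gold data contains two errors, so precision and recall are both one half.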
Keywords/Search Tags: Chinese spelling error correction, multimodal feature encoding, CPO, Chinese spelling error correction system