Study On The Method Of Automatic Proofreading Of Word-level Chinese Text

Posted on: 2019-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Zhuo
Full Text: PDF
GTID: 2428330545954894
Subject: Computer technology
Abstract/Summary:
With the rapid growth of Internet data, the quality of information on the Internet has been declining. News publishers, radio and television broadcasters, and similar organizations place high demands on text quality, yet proofreading in these industries is still mainly manual. Errors remain, to varying degrees, in words, phonetic notation, numbers, symbols, and so on, so automatic text proofreading is of practical significance. This thesis studies automatic proofreading of word-level text, which comprises two parts: automatic error checking and automatic error correction. Error checking uses a joint model, and error correction applies a method targeted to each error type. The specific contents are as follows:

(1) The joint error detection model used for automatic error checking combines a conditional random field (CRF) with n-gram strings. The model first checks the text with the CRF and with the n-gram strings separately, and then combines the two results to generate the final error detection result. The experimental results show an accuracy of 95.8% at the detection layer and 39.5% at the recognition layer.

(2) Text errors are classified by type as missing, redundant, erroneous generation, and disorder. This thesis applies a different correction method to each error type: missing errors are corrected with a language model, redundant errors by direct deletion, and erroneous generation errors with a homophone dictionary. The thesis focuses on the correction methods based on the language model and the homophone dictionary. The correction rate of text error correction reaches 16.7%.

This thesis also designs and implements an automatic text proofreading system. The system is divided into two modules: an automatic error checking module and an automatic error correction module. The error checking module provides CRF error checking and n-gram string error checking; the error correction module provides a missing correction function, a redundant correction function, and an erroneous generation correction function.
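The pipeline described above (two detectors whose results are combined, followed by homophone-based correction scored by a language model) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy corpus, the `HOMOPHONES` table, the function names, the use of a set union to combine detector outputs, and an add-one-smoothed bigram model standing in for the language model. The flagged positions are supplied by hand rather than produced by a real CRF or n-gram checker.

```python
import math
from collections import Counter

# Toy corpus of already-segmented sentences (hypothetical data).
CORPUS = [
    ["我", "在", "学校", "学习"],
    ["他", "在", "学校", "工作"],
    ["我", "再", "说", "一次"],
]

# Hypothetical homophone dictionary: word -> words pronounced the same way.
HOMOPHONES = {"再": ["在"], "在": ["再"]}


def train_bigram(corpus):
    """Count unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sent, unigrams, bigrams):
    """Log-probability of a sentence under an add-one-smoothed bigram model."""
    vocab = len(unigrams)
    tokens = ["<s>"] + sent + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )


def merge_detections(crf_flags, ngram_flags):
    """Assumption: a position is suspicious if either detector flags it."""
    return sorted(set(crf_flags) | set(ngram_flags))


def correct_with_homophones(sent, flagged, unigrams, bigrams):
    """At each flagged position, keep the homophone that scores best, if any."""
    result = list(sent)
    for i in flagged:
        best, best_score = result[i], log_prob(result, unigrams, bigrams)
        for cand in HOMOPHONES.get(sent[i], []):
            trial = result[:i] + [cand] + result[i + 1:]
            score = log_prob(trial, unigrams, bigrams)
            if score > best_score:
                best, best_score = cand, score
        result[i] = best
    return result


unigrams, bigrams = train_bigram(CORPUS)
sentence = ["我", "再", "学校", "学习"]           # "再" is a homophone error for "在"
flagged = merge_detections([1], [1, 2])          # hand-written detector outputs
corrected = correct_with_homophones(sentence, flagged, unigrams, bigrams)
print(flagged, corrected)
```

In this sketch a flagged word is replaced only when a homophone candidate raises the sentence score, so a flagged position with no dictionary entry is left unchanged. A union favors recall over precision; an intersection or a weighted vote would be equally plausible ways to combine the two detectors.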
Keywords/Search Tags: text proofreading, conditional random fields, n-gram model, scattered string, joint model, homophone