The Research On The Automatic Proofreading Algorithm Of Recognition Flow

Posted on: 2009-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Wang
GTID: 2178360242476845
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of information technology, many kinds of Chinese input technology are applied to entering electronic information. However, because of user carelessness or limitations of the input program, the input cannot be 100% correct, and there is neither the time nor the manpower for manual proofreading. Automatic proofreading has therefore become an urgent problem. Research on automatic proofreading for Chinese has only just begun: few researchers work in this field, and even fewer papers on it have been published. Little of that research focuses on text produced by recognition, so this research has practical value.

There are three main methods of fault detection. The first is based on information about the characters, words, and tags in the context. The second is based on analysis of the transition probability between adjacent words. The third is based on linguistic rules or knowledge. Three methods are applied to correct faults: pattern matching, word-replacement tables, and approximate matching.

In this paper, after analyzing the special characteristics of the recognition flow of text, an automatic proofreading method aimed at recognition output is proposed. It is divided into several main parts: a combined method of segmentation and part-of-speech tagging for the recognition flow; a method for generating the confusion set; and a method for offering correction suggestions. To implement the algorithm, we use dynamic programming, which greatly reduces the time and space cost of the computation.

The novel contributions of the thesis are as follows: first, the method for generating the confusion set; second, the combined method of segmentation and part-of-speech tagging for automatic proofreading of the recognition flow; third, the optimization of the parameter in the function.

The results show that the system detects and corrects faults in the recognition flow effectively and achieves its goal.
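
As an illustration of how confusion sets, transition probabilities between adjacent words, and dynamic programming can fit together, the following is a minimal sketch and not the thesis's actual algorithm. It assumes a bigram model with add-one smoothing and a hand-written confusion set; the function names (`train_bigram`, `log_prob`, `correct`) and the toy English corpus are hypothetical and chosen only for readability.

```python
import math
from collections import defaultdict


def train_bigram(sentences):
    """Count unigrams (as left context) and bigrams over tokenized sentences."""
    unigram = defaultdict(int)
    bigram = defaultdict(int)
    for sentence in sentences:
        tokens = ["<s>"] + list(sentence)
        for prev, cur in zip(tokens, tokens[1:]):
            unigram[prev] += 1
            bigram[(prev, cur)] += 1
    vocab = {tok for sentence in sentences for tok in sentence} | {"<s>"}
    return unigram, bigram, len(vocab)


def log_prob(prev, cur, model):
    """Add-one smoothed log P(cur | prev). The smoothing choice is an assumption."""
    unigram, bigram, vocab_size = model
    count = bigram.get((prev, cur), 0) + 1
    total = unigram.get(prev, 0) + vocab_size
    return math.log(count / total)


def correct(tokens, confusion, model):
    """Viterbi-style search: choose one candidate per position so that the
    sum of bigram log-probabilities along the path is maximal."""
    # best maps the last chosen candidate to (score, path so far)
    best = {"<s>": (0.0, [])}
    for tok in tokens:
        # The observed token is always a candidate, plus its confusion set.
        candidates = [tok] + sorted(confusion.get(tok, ()))
        step = {}
        for cand in candidates:
            score, path = max(
                ((s + log_prob(prev, cand, model), p) for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            step[cand] = (score, path + [cand])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


# Toy data, purely illustrative.
corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["a", "cat", "sat"]]
model = train_bigram(corpus)
confusion = {"cut": {"cat"}}

observed = ["the", "cut", "sat"]
corrected = correct(observed, confusion, model)
# Positions where the best path differs from the input are the detected faults.
faults = [i for i, (a, b) in enumerate(zip(observed, corrected)) if a != b]
print(corrected, faults)
```

Because only the best path ending in each candidate is kept at every position, the search cost grows linearly with sentence length rather than with the number of candidate combinations, which is the kind of saving the abstract attributes to dynamic programming.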
Keywords/Search Tags: recognition flow of text, automatic proofreading, n-gram, confusion set, fault detection, fault correction