An Automatic Text Proofreading System Based on Natural Language Processing

Posted on: 2007-03-25  Degree: Master  Type: Thesis
Country: China  Candidate: H Ding  Full Text: PDF
GTID: 2208360185956184  Subject: Computer applications
Abstract/Summary:
With the spread of computers and the internet, and the progression from data processing through information processing to knowledge processing, demands on the depth and breadth of language and character processing keep rising. The level and volume of a country's information processing are widely regarded as a basic indicator of how far it has entered the information society, and the ability to process language information bears directly on its international competitiveness in the network society and network economy. Both are developing quickly around the world, and the chief bottleneck holding them back is natural language processing. Once network-based natural language processing achieves a breakthrough, the network society and network economy are expected to develop very rapidly. Many countries' research academies and institutions have therefore devoted substantial manpower and material resources to this field. China has also made it a technological priority and listed it as a key project.

Automatic proofreading of Chinese text, a part of applied basic research in natural language processing, has drawn increasing attention with the development of electronic journals and has become an urgent task.

Building on a study and analysis of existing techniques and methods for automatic proofreading of Chinese text, this thesis presents an improved method that checks text for errors at three levels: character and word, syntax, and semantics. First, for character and word errors, it uses the neighborship (adjacency) of characters or words and detects errors by the co-occurrence probability of character strings. Second, for syntactic errors, drawing on statistics and analysis of a large-scale contemporary Chinese corpus, it identifies the predicate focus word and the other sentence constituents and then checks for syntax errors. Third, for semantic errors, it builds a semantic dependency tree based on HowNet knowledge and presents a method that computes sentence similarity through semantic dependency analysis in order to check for semantic errors.
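
The first of the three checks, detecting character errors from the co-occurrence probability of adjacent characters, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the add-one smoothing, the toy corpus, and the threshold value are all assumptions chosen for illustration, and the probability is a simple bigram estimate rather than whatever estimator the thesis uses.

```python
from collections import Counter


def train_counts(corpus):
    """Count single characters and adjacent character pairs in a list of sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    return unigrams, bigrams


def cooccurrence_prob(left, right, unigrams, bigrams):
    """Estimate how likely `right` is to follow `left`.

    Add-one smoothing is an assumption of this sketch; it keeps unseen
    pairs from getting probability zero.
    """
    vocab_size = len(unigrams) or 1
    return (bigrams[(left, right)] + 1) / (unigrams[left] + vocab_size)


def flag_suspects(sentence, unigrams, bigrams, threshold):
    """Return each index i whose pair (sentence[i], sentence[i+1]) scores below threshold."""
    return [
        i
        for i in range(len(sentence) - 1)
        if cooccurrence_prob(sentence[i], sentence[i + 1], unigrams, bigrams) < threshold
    ]


# Hypothetical toy corpus; a real system would train on a large corpus.
toy_corpus = ["自然语言处理", "自然语言理解", "语言处理系统"]
unigrams, bigrams = train_counts(toy_corpus)
suspects = flag_suspects("自然语吉", unigrams, bigrams, threshold=0.1)
```

In this toy setting the pair at index 2 (语 followed by the mistyped 吉) never occurs in the corpus, so it falls below the threshold and is flagged, while the familiar pairs before it are not. A flagged position only marks a candidate error; choosing a correction, for example from a confusing-words dictionary, is a separate step.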
Keywords/Search Tags: automatic proofreading, neighborship, predicate focus word, semantic dependency relationship, confusing words dictionary