
Research And Implementation Of Characteristics Of Complex Sentence Analyzer In Chinese Information Processing

Posted on: 2012-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: T Xu
GTID: 2178330335469503
Subject: Computer software and theory
Abstract/Summary:
Chinese information processing is an interdisciplinary field that combines computer science, linguistics, mathematics, and information science. It has developed rapidly in recent years with the popularization of the Internet and advances in information processing technology. The field covers character processing, word processing, sentence processing, and discourse processing. However, because of the particularity and complexity of Chinese, most studies so far remain at the character and word processing stage, and progress on processing Chinese sentences, especially complex sentences, has been very slow.

The complex sentence feature analyzer described in this thesis is a core part of an automatic relation-word marking system within a complex sentence project. It is mainly responsible for extracting the basic features of Chinese complex sentences. The analyzer comprises the following functional modules: 1) sentence structure similarity calculation; 2) syntactic analysis of sentences; 3) string matching; 4) part-of-speech marking; 5) clause marking and span calculation; 6) semantic relationship calculation; 7) relation-word processing.

This thesis studies several key technologies of the analyzer:
1. It proposes an algorithm for calculating the similarity of Chinese sentences. The algorithm measures structural similarity based on part-of-speech strings: it uses the correlation between parts of speech to find the longest matching part-of-speech strings shared by the two sentences.
2. It proposes a clause marking algorithm. Its main idea is to merge certain stand-alone segments, such as a relation word that forms a clause by itself or a sentence component that forms a clause by itself, into the adjacent clause, following the principles of practicality and efficiency.
3. It proposes a sentence component analysis algorithm based on dependency syntax. The algorithm applies a set of rules for extracting sentence elements: a mechanism for identifying the predicate core, a mechanism for identifying the trunk of the sentence, and a mechanism for identifying modifier components and parallel components. Using these rules, it divides a Chinese complex sentence into semantic clauses, and divides each semantic clause into subject, predicate, and object, as well as head words, modifiers, and parallel components.
Keywords/Search Tags: Chinese information processing, complex sentence processing, sentence similarity algorithm, clause marking, dependency syntax, sentence component analysis
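The sentence similarity idea in the abstract, finding the longest matching part-of-speech strings of two sentences, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names, the `RELATED` table of interchangeable tags, and the normalization `2 * run / (len(x) + len(y))` are all assumptions made for illustration. The abstract states only that correlation between parts of speech is used, not how.

```python
from typing import Sequence

# Hypothetical pairs of part-of-speech tags treated as interchangeable.
# These pairs are illustrative assumptions, not taken from the thesis.
RELATED = {frozenset(("n", "r")), frozenset(("v", "vn"))}


def tags_match(a: str, b: str) -> bool:
    """Return True when two tags are equal or listed as related."""
    return a == b or frozenset((a, b)) in RELATED


def longest_matching_run(x: Sequence[str], y: Sequence[str]) -> int:
    """Length of the longest contiguous run of matching tags.

    Standard longest-common-substring dynamic programming, with
    tags_match() used in place of strict equality.
    """
    best = 0
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        curr = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if tags_match(x[i - 1], y[j - 1]):
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def structural_similarity(x: Sequence[str], y: Sequence[str]) -> float:
    """Assumed normalization: twice the run length over the total length."""
    if not x or not y:
        return 0.0
    return 2 * longest_matching_run(x, y) / (len(x) + len(y))
```

Each sentence is passed in as its sequence of part-of-speech tags, for example `structural_similarity(["r", "v", "n"], ["n", "v", "n"])`. Two sentences with the same tag pattern score 1.0 under this assumed normalization, and sentences with no matching tags score 0.0.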