
Design And Implementation Of Collocation Repositories In Chinese Intelligent Input Method Based On Grammar And Semantics

Posted on: 2007-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: W J Liang
GTID: 2178360185954029
Subject: Applied Mathematics
Abstract/Summary:
Chinese information processing is the automatic processing of Chinese-language information by computer. In this field, solving the Chinese input problem is a basic but crucial task. Although some non-keyboard Chinese input products have been released, keyboard input remains the most popular input technique and an important topic in Chinese information processing. Among existing keyboard input methods, some take input by character or word, and others by phrase or sentence. None of these methods is very intelligent. To raise the intelligence of the input method, a Chinese intelligent input method based on grammar and semantics is designed. It draws on knowledge of word collocation, grammatical collocation, and semantic collocation. This thesis focuses on designing and implementing the collocation repositories that the input method uses. The detailed work is as follows:

1. Design and implement the binary-word collocation repository. First, the long-distance and near-distance spans are set dynamically to extract candidate collocations. Second, several improved statistical measures (long-distance collocation intensity, near-distance collocation intensity, long-distance collocation dispersion, near-distance collocation dispersion, and pinnacle) are used to filter the candidate collocations. Third, the remaining candidates are filtered manually using linguistic knowledge, and the result is stored in the binary-collocation repository.
2. Design and implement the tri-word collocation repository. Each collocation in the binary-collocation repository is treated as a keyword, the procedure used to build the binary-collocation repository is repeated, and the results are stored in the tri-collocation repository.
3. Design and implement the grammatical collocation repository. Phrases and sentences contain internal grammatical collocations, so grammatical collocation instances are extracted according to the grammatical collocation rules in the abstract grammatical repository. The extracted results are stored in the grammatical repository.
4. Design and implement the semantic collocation repository. First, the Synonymy Thesaurus is used to assign codes to word senses, and these codes are then used to represent semantic collocations. Every collocation in the binary-collocation repository is labeled with its semantic codes. The resulting statistics are stored in the semantic repository.
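The extract-then-filter procedure of step 1 can be illustrated with a minimal sketch. It is not the thesis's method: the abstract does not give the improved formulas, so the code below substitutes the generic strength (z-score of co-occurrence frequency) and spread (variance of the per-offset counts) statistics known from Xtract-style collocation extraction. The function names, the thresholds, and the assumption that the corpus is already word-segmented are all hypothetical. The `span` parameter is left open so the same code could be run once with a near-distance span and once with a long-distance span.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def offset_histograms(sentences, keyword, span):
    """Count, for every word near `keyword`, how often it occurs at each
    relative offset in [-span, span] (offset 0 is the keyword itself).
    `sentences` is assumed to be a list of already-segmented token lists."""
    hists = defaultdict(Counter)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != keyword:
                continue
            lo = max(0, i - span)
            hi = min(len(tokens), i + span + 1)
            for j in range(lo, hi):
                if j != i:
                    hists[tokens[j]][j - i] += 1
    return hists


def score_candidates(hists, span):
    """Return {word: (strength, spread, peak_offset)}.
    strength: z-score of the word's total co-occurrence count (assumed measure).
    spread: variance of its counts over the 2*span offsets (assumed measure).
    peak_offset: the offset at which the word occurs most often."""
    if not hists:
        return {}
    totals = {w: sum(h.values()) for w, h in hists.items()}
    avg = mean(totals.values())
    sd = pstdev(totals.values())
    offsets = [d for d in range(-span, span + 1) if d != 0]
    scored = {}
    for w, h in hists.items():
        freq = totals[w]
        strength = (freq - avg) / sd if sd > 0 else 0.0
        per_offset = [h.get(d, 0) for d in offsets]
        m = freq / len(per_offset)
        spread = sum((c - m) ** 2 for c in per_offset) / len(per_offset)
        peak_offset = max(h, key=h.get)
        scored[w] = (strength, spread, peak_offset)
    return scored


def filter_candidates(scored, min_strength, min_spread):
    """Keep candidates that pass both (hypothetical) thresholds."""
    return {
        w: s for w, s in scored.items()
        if s[0] >= min_strength and s[1] >= min_spread
    }
```

In this sketch a candidate survives only if it co-occurs with the keyword unusually often and its occurrences cluster at particular offsets rather than being spread evenly across the window. The manual linguistic filtering described in step 1 would then be applied to whatever `filter_candidates` returns.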
Keywords: Chinese information processing, Collocation, Grammatical Collocation, Semantic Collocation