
Chinese Participle Algorithm Research Based On Word Table Structure

Posted on: 2008-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y He
Full Text: PDF
GTID: 2178360212983396
Subject: Computer application technology
Abstract/Summary:
It is well known that English text takes the word as its basic unit, with words separated by spaces. Chinese text, by contrast, takes the character as its basic unit, and characters run together without delimiters up to the level of the sentence, which expresses a complete meaning. A computer can therefore identify English words easily, but a Chinese sentence, which is also composed of words, can only be processed with Chinese word segmentation ("Chinese participle") technology. Chinese word segmentation, also called word cutting, divides a sequence of Chinese characters into meaningful words. It is an important technology in the field of information processing. This thesis carries out the following research on Chinese word segmentation.

The thesis briefly introduces the basic concepts of Chinese word segmentation and the state of research in China and abroad, and describes the framework and working principle of a Chinese word segmentation system. It studies the structure of existing electronic Chinese word tables (dictionaries), discusses the technical characteristics of existing segmentation algorithms, and implements and proposes related methods and techniques.

On the basis of a thorough analysis of this groundwork and of some characteristics of Chinese characters, the author proposes a new data structure for Chinese text. Its basic principle is to unify all single characters, words and phrases into one class of entries from which the electronic dictionary is built. Based on this data structure, the author discusses and implements an improved segmentation algorithm, nearest-neighbor ("close neighbor") matching.

Chinese names are hard to recognize and are mostly unregistered (out-of-vocabulary) words. To address this, the author implements and recommends a recognition method that combines statistical and boundary information, competition among candidate names, and rule-based filtering. Some improvements are also made to name filtering and feedback.

The author built a test platform in a laboratory environment through secondary development on Lucene, and tested the function and performance of the research results. The results indicate that the word table structure studied in this thesis gives higher access efficiency and greatly reduces storage requirements. The nearest-neighbor matching method achieves higher segmentation speed and accuracy than the segmentation algorithms used previously. The Chinese name recognition method proposed in this thesis also achieves better recognition results.

Research on Chinese word segmentation cannot be completed in a single stroke or solved within a year or two. Future research topics include related part-of-speech and word-sense problems, the tolerance and filtering of ambiguity, and threshold values for boundary division.
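
The abstract does not specify the thesis's matching algorithm or the layout of its word table, so the following is only a minimal sketch of the general dictionary-based approach it builds on. It uses classic forward maximum matching, a standard technique swapped in here and not the author's "close neighbor" method, over a hypothetical word table indexed by first character. The names `build_word_table` and `forward_max_match` and the sample vocabulary are assumptions made for illustration.

```python
from collections import defaultdict


def build_word_table(words):
    """Group dictionary entries by their first character (hypothetical layout)."""
    table = defaultdict(set)
    for word in words:
        if word:
            table[word[0]].add(word)
    return table


def forward_max_match(sentence, table):
    """Segment `sentence` by taking the longest dictionary entry at each position.

    Characters that start no dictionary entry are emitted as single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(sentence):
        # Only entries sharing the current first character can match here.
        candidates = table.get(sentence[i], ())
        max_len = max((len(w) for w in candidates), default=1)
        match = sentence[i]  # fallback: a single character
        # Try the longest possible piece first, then shrink down to length 2.
        for length in range(min(max_len, len(sentence) - i), 1, -1):
            piece = sentence[i:i + length]
            if piece in candidates:
                match = piece
                break
        tokens.append(match)
        i += len(match)
    return tokens


if __name__ == "__main__":
    # Sample vocabulary chosen for illustration only.
    table = build_word_table(["中文", "分词", "技术", "中文分词", "研究"])
    print(forward_max_match("中文分词技术研究", table))
    # prints ['中文分词', '技术', '研究']
```

Indexing by first character keeps each lookup confined to a small bucket of candidates, which is one common way a word table structure can reduce access cost. Whether the thesis uses this particular layout is not stated in the abstract.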
Keywords/Search Tags:Chinese participle, the word sheet, the name recognition, the participle algorithm, the neighbor mat