Research And Implementation Of The Chinese Automatic Word Segmentation Based On The Ant Colony Algorithm

Posted on: 2005-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: X H Luo
Full Text: PDF
GTID: 2168360125966423
Subject: Computer application technology
Abstract/Summary:
Chinese automatic word segmentation is a fundamental task in Chinese information processing, and resolving segmentation ambiguity is the key factor affecting segmentation precision. Researchers have proposed many methods for this problem in recent years, but substantial difficulties remain in improving the recognition and segmentation of ambiguous strings.

Our research rests on two points. First, we consider it important to study the linguistic phenomena that affect segmentation precision, so that the essence of the problem can be understood as a whole. Second, in modeling segmentation and designing the algorithm, we focus on enhancing the computing ability of the segmentation model, and we give close consideration to how linguistic information is measured during parsing.

The Ant Colony Algorithm has been applied successfully to the well-known Traveling Salesman Problem (TSP) and other hard combinatorial optimization problems. The author therefore applies it to Chinese automatic word segmentation by designing a suitable data structure for the sentence. Using word frequency as the heuristic value, this thesis converts pure segmentation into a word-selection problem. Two computational methods are also described in detail, which associate the heuristic value with the Absolute Discounting and back-off schemes. The experimental results support the validity of the approach.

On a test set drawn from unified linguistic resources, we make a comprehensive comparison between the results of our solution and those of ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System) on ambiguity segmentation. With regard to segmentation precision, we also discuss segmentation knowledge concerning word frequency and the semantic information of words.
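
The idea of treating segmentation as word selection guided by word frequency can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the data structure, the pheromone update rule, or the parameter values, so the word lattice, the names `build_lattice` and `segment`, the parameters `alpha`, `beta` and `rho`, the additive floor of 0.5 for unseen strings, and the toy lexicon are all assumptions made for illustration. The Absolute Discounting and back-off heuristics mentioned above are not reproduced here; a plain floored relative frequency stands in for them.

```python
import math
import random


def build_lattice(sentence, lexicon, max_len=4):
    """Map each start index to the end indices of candidate words.

    Single characters are always allowed, so a complete path always exists.
    """
    edges = {}
    n = len(sentence)
    for i in range(n):
        ends = [i + 1]
        for j in range(i + 2, min(n, i + max_len) + 1):
            if sentence[i:j] in lexicon:
                ends.append(j)
        edges[i] = ends
    return edges


def segment(sentence, lexicon, n_ants=20, n_iters=30,
            alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Ant-colony search over the word lattice (illustrative sketch only)."""
    rng = random.Random(seed)
    edges = build_lattice(sentence, lexicon)
    total = sum(lexicon.values())
    floor = 0.5  # assumed floor so unseen strings keep a nonzero heuristic

    def eta(i, j):
        # Heuristic value: floored relative frequency of the candidate word.
        return (lexicon.get(sentence[i:j], 0) + floor) / (total + floor)

    # Pheromone on every lattice edge, initialised uniformly.
    tau = {(i, j): 1.0 for i, ends in edges.items() for j in ends}
    best_path, best_score = [], float("-inf")

    for _ in range(n_iters):
        walks = []
        for _ in range(n_ants):
            i, path, score = 0, [], 0.0
            while i < len(sentence):
                ends = edges[i]
                weights = [tau[(i, j)] ** alpha * eta(i, j) ** beta
                           for j in ends]
                j = rng.choices(ends, weights=weights)[0]
                path.append((i, j))
                score += math.log(eta(i, j))  # log-likelihood of the path
                i = j
            walks.append((score, path))
            if score > best_score:
                best_score, best_path = score, path
        # Evaporate, then let each ant deposit pheromone by path quality.
        for edge in tau:
            tau[edge] *= 1.0 - rho
        for score, path in walks:
            for edge in path:
                tau[edge] += 1.0 / (1.0 - score)  # score <= 0, so this is > 0

    return [sentence[i:j] for i, j in best_path]


if __name__ == "__main__":
    # Toy frequencies, invented for this example.
    toy_lexicon = {"研究": 50, "研究生": 20, "生命": 40, "起源": 30,
                   "命": 5, "生": 5, "研": 1, "究": 1, "起": 2, "源": 2}
    print(segment("研究生命起源", toy_lexicon))
```

The example sentence contains a classic overlapping ambiguity ("研究/生命" versus "研究生/命"). Because the frequency heuristic strongly favours "研究" and "生命" in this toy lexicon, most ants choose that path and the pheromone reinforces it. A real system would replace the floored frequency with a properly smoothed estimate.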
Keywords/Search Tags: Chinese automatic word segmentation, ambiguity segmentation, Ant Colony Algorithm