The Research On Text Document Clustering Technology Based On Ant Colony

Posted on: 2011-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Tang
Full Text: PDF
GTID: 2178360308477213
Subject: Computer application technology
Abstract/Summary:
Text clustering is an important research direction in data mining and information retrieval. As the volume of data on the network grows, and since most of that data is stored as text, the demand for extracting information from large text collections keeps rising. Text clustering is an unsupervised learning technique and can be carried out automatically by computer. By comparing the similarity of texts, it uncovers their inherent characteristics and distribution. The results can then be used to organize web documents effectively and to build a classification model that guides their categorization, which in turn supports retrieval and reading. Research on text clustering is therefore particularly important.

In recent years, researchers inspired by the way ants in nature gather their dead into piles have proposed clustering algorithms based on ant colonies. Combining ant colony clustering with text clustering technology yields the ant-based text clustering algorithm (Ant-colony Text Cluster Algorithm). This algorithm scales well, supports parallel computation, and uses positive feedback. It does not require the number of cluster centers to be set in advance, it clusters in a self-organizing way, and it offers robustness and easy visualization, although it still has some disadvantages.

This thesis introduces the idea of tabu search into the ant colony clustering algorithm and proposes ATTCA (Ant-Tabu Text Cluster Algorithm), a text clustering algorithm that integrates the ant colony and tabu search algorithms. After the ant colony algorithm generates an initial solution, tabu search performs a local search starting from that solution. This counters the ant colony algorithm's tendency to become trapped in a local optimum and also reduces tabu search's dependence on its initial solution, so the two methods complement each other. Experimental results show that the improved algorithm achieves higher accuracy than the text clustering algorithm based on ant colony alone.
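The second stage described above, a tabu search that refines an initial clustering, can be illustrated with a minimal sketch. This is not the thesis's ATTCA implementation: the ant colony stage is replaced here by a hand-supplied initial labeling, and the objective (`cohesion`, the summed cosine similarity of each document to its cluster centroid), the move definition, and the names `tabu_refine`, `tenure`, and `iterations` are all assumptions chosen for illustration.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity between two term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cohesion(docs, labels):
    """Assumed objective: sum of each document's similarity to its centroid."""
    clusters = {}
    for vec, lab in zip(docs, labels):
        clusters.setdefault(lab, []).append(vec)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(cosine(v, centroid) for v in members)
    return total


def tabu_refine(docs, initial_labels, iterations=50, tenure=5):
    """Refine an initial labeling by moving one document per step."""
    current = list(initial_labels)
    best, best_score = list(current), cohesion(docs, current)
    cluster_ids = sorted(set(current))
    tabu = deque(maxlen=tenure)  # recently reversed moves: (doc index, cluster)
    for _ in range(iterations):
        candidate = None
        for i in range(len(docs)):
            for c in cluster_ids:
                if c == current[i]:
                    continue
                trial = current[:i] + [c] + current[i + 1:]
                score = cohesion(docs, trial)
                # Skip tabu moves unless they beat the best score (aspiration).
                if (i, c) in tabu and score <= best_score:
                    continue
                if candidate is None or score > candidate[0]:
                    candidate = (score, i, c)
        if candidate is None:
            break  # every possible move is tabu
        score, i, c = candidate
        tabu.append((i, current[i]))  # forbid moving doc i straight back
        current[i] = c  # accept the move even if it worsens the score
        if score > best_score:
            best, best_score = list(current), score
    return best, best_score


# Toy data: two documents about one topic, two about another.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
initial = [0, 0, 0, 1]  # stand-in for the ant colony stage's output
labels, score = tabu_refine(docs, initial)
```

Accepting the best non-tabu move even when it lowers the score is what lets the search leave a local optimum, while the separately stored `best` labeling ensures the returned result is never worse than the initial solution.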
Keywords/Search Tags:Data Mining, text clustering, ant colony algorithm, tabu search algorithm