
The Research On Fuzzy C-means Documents Clustering Based On Ant Colony Optimization

Posted on: 2011-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: F Wang
Full Text: PDF
GTID: 2178330332965286
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information storage and communication technologies, the amount of information people face has grown explosively. To find information of interest, people often need to cluster large numbers of documents so that they can locate their target quickly. Document clustering is an important part of text mining. It aims to divide documents automatically according to similarity rules so that the resulting clusters show high cohesion and a high degree of aggregation. How to cluster text by computer has become a research topic of significant value with broad application prospects.

Text is characterized by ambiguity and polysemy, and converting it into a form that a computer can handle may produce a high-dimensional vector. Because the Fuzzy C-Means (FCM) algorithm can deal with the ambiguity problem and has linear complexity, fuzzy clustering is now a focus of text clustering research.

Based on the ant colony algorithm from swarm intelligence, this paper analyzes methods for correcting the shortcomings of the FCM algorithm. The main work is as follows:

(1) This paper analyzes the main document clustering algorithms and compares their advantages and disadvantages. It then presents a cluster optimization algorithm that uses the ant colony clustering algorithm to find the initial cluster centers of the document set.

(2) Through in-depth research and analysis of the ant colony algorithm, we find that fuzzy clustering can help ant colony optimization (ACO) clustering overcome nonlinear problems, and that ant colony clustering can help FCM overcome its sensitivity to the initial cluster centers. Therefore, this paper proposes an algorithm that combines ant colony clustering with FCM to obtain the benefits of both.

(3) This paper proposes a text cleaning algorithm based on entropy calculation, applied during word segmentation.

(4) This paper uses general Chinese and English document sets, with the programs written in VS.NET. We evaluated the ant-colony-based Fuzzy C-Means algorithm experimentally, and the results show that the algorithm is effective.
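The role that initial centers play in FCM can be illustrated with a minimal sketch of the standard FCM iteration. This is not the thesis's implementation (which was written in VS.NET). The function name `fcm`, its parameters, the toy data, and the hand-picked `initial_centers` are all assumptions made for illustration. In the hybrid approach described above, the initial centers would come from the ant colony clustering stage, which is not reproduced here.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fcm(points, centers, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means, started from caller-supplied centers.

    Returns the final centers and the membership matrix computed in the
    last iteration (one row per point, one column per cluster).
    """
    c = len(centers)
    dim = len(points[0])
    centers = [list(v) for v in centers]
    u = []
    for _ in range(iters):
        # Membership update: u[i][k] is proportional to d(i, k)^(-2/(m-1)).
        u = []
        for x in points:
            d = [dist(x, v) for v in centers]
            zeros = [k for k, dk in enumerate(d) if dk == 0.0]
            if zeros:
                # A point lying exactly on a center belongs fully to it.
                row = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(c)]
            else:
                w = [dk ** (-2.0 / (m - 1.0)) for dk in d]
                s = sum(w)
                row = [wk / s for wk in w]
            u.append(row)
        # Center update: mean of all points weighted by u^m.
        new_centers = []
        for k in range(c):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(centers[k])  # keep an empty cluster's center
                continue
            new_centers.append([
                sum(wt * x[j] for wt, x in zip(weights, points)) / total
                for j in range(dim)
            ])
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


# Toy 2-D "document vectors" forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
# Hypothetical stand-in for centers produced by an ant colony stage.
initial_centers = [(1.0, 1.0), (9.0, 9.0)]
centers, memberships = fcm(points, initial_centers)
```

Because FCM only converges to a local optimum, a poor choice of `initial_centers` can yield a different final partition. That is the sensitivity the combined algorithm is meant to reduce.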
Keywords/Search Tags: Text mining, Fuzzy document clustering, Ant colony algorithm, Data cleaning