Adaptive Clustering Algorithm Analysis Based On K-Means

Posted on:2010-10-13

Degree:Master

Type:Thesis

Country:China

Candidate:L Liu

Full Text:PDF

GTID:2178360278465889

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

The rapid development of information technology raises a new challenge: how to obtain and use huge amounts of information effectively. Traditional search methods require explicit queries, but users often cannot articulate what they are looking for, so extracting useful information from the Internet without explicit queries has become a meaningful research topic. Text mining is an effective way to extract useful information from such unstructured text data. Clustering is a key technique in text mining: it can reveal data distributions and implicit patterns, and it can find useful structure and clusters without background knowledge.

Against this background, this thesis first reviews the current status of clustering algorithms and their relationship to related research fields. To prepare for the later chapters, it gives mathematical formulations of the basic concepts of clustering analysis, such as similarity calculation and distance measures. It then analyzes five traditional clustering algorithms and compares their performance. Based on the merits and demerits of these algorithms, it proposes an adaptive clustering algorithm that determines the number of categories automatically by finding the optimum of a discriminant function defined in the thesis. This avoids the subjectivity of choosing the number of clusters from human experience, a weakness of the traditional clustering methods, and the effectiveness of the approach is demonstrated experimentally.

The thesis then introduces a topic detection system based on the proposed adaptive clustering algorithm. The system discovers implicit knowledge in a stream of text and provides representative keywords for each topic according to its main idea. Experimental results show that the system detects latent topics effectively, which further supports the validity of the adaptive clustering algorithm. Finally, the thesis summarizes the work and discusses proposals for future work to improve the algorithm's performance.
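
The idea of choosing the number of clusters automatically can be illustrated with a minimal sketch. The abstract does not give the thesis's discriminant function, so the code below substitutes the mean silhouette score as a stand-in criterion. This is an assumption, not the author's method. The helper names (`farthest_first_centers`, `kmeans`, `silhouette`, `choose_k`), the farthest-first seeding, and the toy data are likewise hypothetical and chosen only to keep the example deterministic.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first_centers(points, k):
    """Deterministic seeding (an assumption): start at points[0], then keep
    adding the point that is farthest from its nearest chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Plain k-means; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette score; stand-in for the thesis's discriminant function."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters score 0 by convention
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, k_min=2, k_max=5):
    """Run k-means for each candidate k and keep the k with the best score."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in range(k_min, k_max + 1)}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical): three well-separated groups of four points each.
demo_points = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 0), (10, 1), (11, 0), (11, 1),
               (5, 10), (5, 11), (6, 10), (6, 11)]
best_k, scores = choose_k(demo_points)
print(best_k, scores)
```

Any criterion that rewards compact, well-separated clusters could take the place of `silhouette` here. The point of the sketch is only the outer loop, which replaces a hand-picked cluster count with a search over candidate values.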

Keywords/Search Tags:

Adaptive clustering, Topic detection, Discriminant function, Feature selection, Text mining, Named Entity

Related items

1	Research On Text Clustering And Its Application In Topic Detection Analysis
2	Research And Analysis On Microblog Hot Topic Detection
3	Forum Based Topic Detection And Tracking Algorithms Study
4	Exploring Temporal Text Mining For News Content Anatomy And Recommendation
5	The Design And Implementation Of The Hot Education News Topic Detection System
6	Research On Feature Selection Methods And Its Applications In Text Clustering
7	Research On Key Techniques In Text Mining
8	Research On Hot Topic Detection Methods For Microblog
9	Feature Fusion Methods For Complicated Text Mining
10	Internet News Hot Mining System Research And Implementation