Study On Clustering Algorithm

Posted on:2007-12-03

Degree:Master

Type:Thesis

Country:China

Candidate:G B Li

Full Text:PDF

GTID:2178360212967849

Subject:Computer software and theory

Abstract/Summary:

Clustering has been applied successfully in many fields, including pattern recognition, system modeling, image processing, and data mining, and its basic algorithms are widely used in the life sciences, medicine, the social sciences, geography, and other areas. Clustering is the process of grouping a set of physical or abstract objects into classes of similar objects. A cluster is a collection of data objects that are similar to one another and dissimilar to the objects in other clusters. Clustering is a typical unsupervised learning method. This thesis focuses on key techniques and algorithms of cluster analysis. The main research content includes:

(1) To address the deficiencies of traditional K-means clustering, this thesis proposes an improved algorithm. When a class changes only slightly after an update, its old class center is kept, so there is no need to recompute the class center or the distances between the samples and that class. Experiments show that the improved algorithm spends less time clustering while retaining satisfactory precision.

(2) A hierarchical clustering algorithm based on granularity is presented. In each iteration, any pair of clusters whose distance is less than the given threshold is regarded as adjacent under the current granularity, and the two are merged. The process repeats until the stopping condition is satisfied. Experiments show that this algorithm achieves hierarchical clustering of a data set and requires less time while maintaining precision.

(3) Inspired by the CURE algorithm, this thesis puts forward a new clustering algorithm that represents each cluster by multiple representative points (deputies). It obtains the clustering result by first partitioning the samples into a smaller number of atomic clusters and then merging adjacent atomic clusters. Experiments show that the algorithm is more robust to outliers and can identify clusters with non-spherical shapes and widely varying sizes. It is also a linear-time clustering algorithm, which makes it suitable for clustering very large data sets.

(4) A new method is presented, which combines the hierarchical clustering...
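
The idea in item (1) can be sketched in a few lines of Python. This is only one possible reading of the abstract, not the thesis's actual algorithm: the function name `kmeans_skip_stable`, the `change_tol` parameter, and the rule "a class is stable when the fraction of samples that joined or left it is at most `change_tol`" are all assumptions made for illustration.

```python
import math
import random


def kmeans_skip_stable(points, k, change_tol=0.05, max_iter=100, seed=0):
    """K-means that keeps the old center of a class whose membership barely changed.

    change_tol is a hypothetical parameter: a class counts as stable when the
    number of samples that joined or left it is at most change_tol times its size.
    """
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    # dist[i][j] caches the distance from sample i to center j.
    dist = [[math.dist(p, c) for c in centers] for p in points]
    labels = [row.index(min(row)) for row in dist]
    dirty = [True] * k  # classes whose center must be recomputed

    for _ in range(max_iter):
        moved = False
        for j in range(k):
            if not dirty[j]:
                continue  # stable class: keep old center and cached distances
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                continue  # empty class: leave its center where it is
            new_center = tuple(sum(col) / len(members) for col in zip(*members))
            if new_center != centers[j]:
                centers[j] = new_center
                # Only the distances to the moved center are refreshed.
                for i, p in enumerate(points):
                    dist[i][j] = math.dist(p, new_center)
                moved = True
        if not moved:
            break

        new_labels = [row.index(min(row)) for row in dist]
        changed = [0] * k
        for old, new in zip(labels, new_labels):
            if old != new:
                changed[old] += 1
                changed[new] += 1
        sizes = [new_labels.count(j) for j in range(k)]
        dirty = [changed[j] > change_tol * max(sizes[j], 1) for j in range(k)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers
```

With `change_tol=0` the sketch behaves like ordinary K-means with a distance cache, since a class whose membership is unchanged has an unchanged mean. Larger values trade some accuracy for fewer center and distance computations, which is the trade-off the abstract describes.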
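
The threshold-based merging in item (2) might look roughly like the following Python sketch. The abstract does not say how the distance between clusters is measured or what the stopping condition is, so the sketch assumes centroid distance, one merge pass per threshold, and a caller-supplied list of increasing thresholds standing in for the granularity levels. The names `merge_pass` and `granular_hierarchy` are hypothetical.

```python
import math


def centroid(cluster):
    """Mean point of a cluster given as a list of coordinate tuples."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def merge_pass(clusters, threshold):
    """Merge every group of clusters linked by centroid distances below threshold."""
    parent = list(range(len(clusters)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    cents = [centroid(c) for c in clusters]
    for a in range(len(clusters)):
        for b in range(a + 1, len(clusters)):
            if math.dist(cents[a], cents[b]) < threshold:
                parent[find(a)] = find(b)  # adjacent at this granularity

    merged = {}
    for i, c in enumerate(clusters):
        merged.setdefault(find(i), []).extend(c)
    return list(merged.values())


def granular_hierarchy(points, thresholds):
    """Return one clustering per threshold, from fine to coarse granularity."""
    clusters = [[p] for p in points]
    levels = []
    for t in sorted(thresholds):
        clusters = merge_pass(clusters, t)
        levels.append(clusters)
    return levels
```

For example, `granular_hierarchy(points, [1.5, 10])` returns two levels: the clustering after merging at threshold 1.5, and the coarser clustering obtained by merging those clusters again at threshold 10. The pairwise comparison in `merge_pass` is quadratic in the number of clusters, so this sketch makes no claim about the running time reported in the thesis.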

Keywords/Search Tags:

cluster analysis, algorithm, time complexity, hierarchical clustering, K-means

Related items

1	The Research On Fuzzy C-Means Cluster Analysis And Its Applications
2	Based On The Application Of Cluster Analysis Of Water Pollution Monitoring System
3	Research And Application Of K-means Clustering Algorithm
4	Research And Application Of Improved K-means Algorithm In Multivariate Analysis System
5	Research Of Improved K-means Algorithm And New Cluster Validity Index In Cluster Analysis
6	Methods And Applications Study Of Cluster-based Spatial Data Mining
7	Improved Fuzzy C-Means Clustering Algorithm
8	Research On The Evaluation Methods Of Cluster Analysis Results
9	Watermark Vector Of Characteristics In Image Watermarking Applications
10	Theoretical And Applied Research On Fuzzy C-means Clustering And Its Cluster Validation