
Several Improved Clustering Algorithms And Applications In Image Segmentation

Posted on: 2016-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: X P Dong
Full Text: PDF
GTID: 2348330488957094
Subject: Computer technology
Abstract/Summary:
Clustering is an important analysis method in data mining. Using the similarity between data points, a clustering algorithm divides a dataset into natural groups so that points in the same group are similar while points in different groups are dissimilar. As one of the most important research topics in fields such as machine learning, pattern recognition, and data mining, clustering has attracted increasing attention from academia. A number of clustering methods have been applied successfully in areas such as marketing, biology, text classification, network security, and image processing. However, as its range of applications expands, clustering faces many challenges, such as processing large datasets, automatically determining the number of clusters, identifying clusters of arbitrary shape, and handling complex data types. This thesis studies clustering algorithms in depth and presents solutions to some of these problems. The main achievements are as follows:

(1) To reduce the limitations of spectral clustering in processing large datasets and its dependence on the scale parameter, this thesis proposes a fast spectral clustering algorithm based on density information and applies it to texture image segmentation. The algorithm consists of two phases: sampling and clustering. In the sampling phase, a simplified DBSCAN algorithm roughly partitions the dataset and yields a small number of representative exemplars that preserve the original structure of the data; the density information of these exemplars is then calculated. In the clustering phase, the density information of the representative points is first used to construct a new similarity matrix that reasonably measures the similarity between data points. A new spectral clustering algorithm, built on the classic spectral clustering framework, is then used to cluster the representative points. Finally, the results of the two phases are combined to obtain the clustering of the original dataset. Experimental results on artificial datasets, UCI datasets, and texture images show that the proposed algorithm handles large datasets effectively, reduces the dependence on the scale parameter, and achieves higher clustering accuracy.

(2) To determine the number of clusters automatically, this thesis presents a new mean shift clustering algorithm based on orthogonal design. The proposed algorithm first uses the orthogonal design method to scatter several detectors uniformly in the data space, takes these detectors as initial points, and uses the mean shift algorithm to move each of them to the neighboring peak of the probability density function. A linking algorithm then merges these detectors. Finally, each data point is assigned to the class containing its nearest detector. Experiments on artificial datasets and UCI datasets show that the proposed algorithm automatically determines the number of clusters and identifies clusters of arbitrary shape.
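
The detector-based procedure described in (2) can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the implementation from the thesis. Several parts are assumptions: the orthogonal design step is replaced by a plain regular grid (`grid_seeds`), the kernel is taken to be Gaussian, the linking step is taken to be single-linkage merging under a distance threshold, and points are assigned using the converged detector positions. All function names and the parameters `bandwidth` and `link_radius` are hypothetical.

```python
import numpy as np


def grid_seeds(X, per_axis=4):
    """Place detectors on a regular grid over the bounding box of X.

    Assumption: a simple stand-in for the orthogonal design used in the thesis.
    """
    axes = [np.linspace(lo, hi, per_axis)
            for lo, hi in zip(X.min(axis=0), X.max(axis=0))]
    return np.stack(np.meshgrid(*axes), axis=-1).reshape(-1, X.shape[1])


def mean_shift_detectors(X, seeds, bandwidth, n_iter=100, tol=1e-5):
    """Move every detector toward a nearby density peak (Gaussian kernel)."""
    detectors = seeds.astype(float).copy()
    for _ in range(n_iter):
        # Squared distance from each detector to each data point.
        d2 = ((detectors[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        total = w.sum(axis=1, keepdims=True)
        # Kernel-weighted mean; a detector with zero total weight stays put.
        shifted = np.where(total > 0,
                           (w @ X) / np.maximum(total, 1e-300),
                           detectors)
        moved = np.linalg.norm(shifted - detectors, axis=1).max()
        detectors = shifted
        if moved < tol:
            break
    return detectors


def link_detectors(detectors, link_radius):
    """Merge detectors: any chain of detectors closer than link_radius
    ends up with the same class label (single linkage)."""
    n = len(detectors)
    labels = -np.ones(n, dtype=int)
    dist = np.linalg.norm(detectors[:, None, :] - detectors[None, :, :], axis=2)
    current = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = current
        stack = [i]
        while stack:
            j = stack.pop()
            for k in np.flatnonzero((dist[j] < link_radius) & (labels == -1)):
                labels[k] = current
                stack.append(k)
        current += 1
    return labels, current


def detector_mean_shift(X, seeds, bandwidth, link_radius):
    """Cluster X; the number of clusters is the number of merged detector groups."""
    detectors = mean_shift_detectors(X, seeds, bandwidth)
    det_labels, n_clusters = link_detectors(detectors, link_radius)
    # Each point takes the class of its nearest (converged) detector.
    d2 = ((X[:, None, :] - detectors[None, :, :]) ** 2).sum(axis=2)
    return det_labels[d2.argmin(axis=1)], n_clusters


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal([0.0, 0.0], 0.4, size=(120, 2)),
                   rng.normal([6.0, 1.0], 0.4, size=(80, 2))])
    labels, n_clusters = detector_mean_shift(
        X, grid_seeds(X), bandwidth=1.0, link_radius=1.0)
    print("clusters found:", n_clusters)
```

In this sketch the number of clusters is never passed in; it falls out of how many groups of detectors remain after merging, which mirrors the motivation stated in (2). The choice of `bandwidth` and `link_radius` still matters and would need tuning per dataset.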
Keywords/Search Tags:Spectral Clustering, Number of Clusters, Mean Shift, Similarity, Texture Image Segmentation