Cluster Analysis And Its Application On Image Processing

Posted on:2013-08-14

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Xiao

Full Text:PDF

GTID:1228330395967935

Subject:Computer Science and Technology

Abstract/Summary:

As an unsupervised learning method, cluster analysis is one of the most important research fields in machine learning. In recent years, data clustering has developed rapidly, and cluster analysis has been applied successfully in many areas, including image processing, text data mining, market research, and bioinformatics. In this dissertation, we focus on two key problems of cluster analysis: similarity measures, and the design and application of new clustering algorithms. The goal of clustering is to discover groups of similar data points, so how similarity is defined and computed is crucial. Building on the existing Gaussian kernel similarity function, we propose a new similarity model. Besides the similarity model, we discuss the effect of the features used in the similarity measure and introduce the intrinsic dimension as a new feature to improve it. Different clustering problems call for fast and effective clustering algorithms. We discuss the advantages and disadvantages of existing clustering methods and propose a fast clustering algorithm that is applicable to image segmentation. Because most real images contain noise, and in order to reduce the effect of noise on both image segmentation and subsequent image analysis, we also propose a sparse representation-based denoising algorithm for mixed noise removal.

The main contributions of this dissertation are as follows:

(1) A weighted self-adaptive Gaussian kernel similarity measure is proposed. The traditional Gaussian kernel similarity measure is suited to data sets whose clusters have similar densities, and it is not robust against outliers. Since real data sets usually contain outliers and clusters of different densities, we propose a new robust Gaussian kernel similarity measure. Building on the existing self-adaptive Gaussian kernel similarity measure, the new measure assigns a weight to each data point according to its neighborhood information; outliers receive small weights, which reduces the similarities between outliers and other points. Experimental results show that the proposed similarity measure gives a better description of both intra-cluster and inter-cluster similarities, leading to better clustering results.

(2) We present a novel similarity measure based on intrinsic dimension. A similarity measure depends not only on the similarity model but also on the data features. Each cluster can be regarded as a sub-manifold, and the data points can be partitioned by defining a new feature that reflects the topological structure of the manifolds. In some cases, intrinsic dimension can be used to distinguish different manifolds: data points in the same cluster are expected to have the same intrinsic dimension, while data points with different intrinsic dimensions should lie on different manifolds. The intrinsic dimension of each data point is estimated from its neighborhood information and used, together with the traditional features, as a new feature for similarity computation. Experimental results show that clustering with the new similarity measure is better than clustering based on similarity computed from the intrinsic dimension alone or from the original features alone.

(3) For data sets with complex structure, it is very difficult to obtain satisfactory clustering results by adjusting the similarity matrix in an unsupervised way. Semi-supervised clustering uses a limited amount of labeled data to guide the clustering process, which can yield better clustering results. In this dissertation, a semi-supervised clustering method based on the affinity propagation algorithm is proposed. Affinity propagation is a clustering algorithm based on a similarity matrix, and its performance can be improved by adjusting the similarity matrix according to known labeled data or pairwise constraints. The experimental results show that, with a small number of pairwise constraints added, the semi-supervised affinity propagation method achieves higher clustering accuracy than the unsupervised affinity propagation algorithm.

(4) A novel method for data clustering is presented based on Wittgenstein's family resemblance. Existing clustering algorithms based on a similarity matrix either have high time complexity or require parameter tuning. The new algorithm constructs an adjacency matrix from the given similarity matrix and finds the connected components of the adjacency matrix to partition the data. Compared with the commonly used spectral clustering methods based on a similarity matrix, the proposed method does not need to compute eigenvectors, which greatly reduces the running time. Moreover, the new method has no parameters once the similarity matrix is given. Experimental results show that the proposed algorithm can be applied successfully to image segmentation, with very encouraging results.

(5) To reduce the effect of noise on both image segmentation and subsequent image analysis, we propose a sparse representation-based denoising algorithm for mixed noise removal. The new algorithm combines a median-type filter with a dictionary learning method and optimizes the proposed l1-l0 model via a three-phase method. It uses double-sparsity to make a double-construction, leading to an enhanced restoration. Experimental results show that the new method gives a notable improvement for both impulse noise removal and mixed Gaussian-impulse noise removal.
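
The similarity-matrix pipeline behind contributions (1) and (4) can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm: the abstract gives no formulas, so the kernel below is the standard self-adaptive (locally scaled) Gaussian kernel, s_ij = exp(-d_ij^2 / (sigma_i * sigma_j)) with sigma_i the distance from point i to its k-th nearest neighbor, and the per-point outlier weights of contribution (1) are left out. The rule used to build the adjacency matrix (link each point to its single most similar point) is an assumption chosen only because it needs no parameter; the function names `self_tuning_similarity` and `nearest_link_components` are hypothetical.

```python
import numpy as np


def self_tuning_similarity(X, k=7):
    """Locally scaled Gaussian kernel: s_ij = exp(-d_ij^2 / (sigma_i * sigma_j)).

    sigma_i is the distance from point i to its k-th nearest neighbor,
    so X must contain more than k points. The diagonal is set to zero.
    """
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)           # squared Euclidean distances
    d = np.sqrt(d2)
    # Column 0 of each sorted row is the point itself (distance 0).
    sigma = np.sort(d, axis=1)[:, k]
    sigma = np.maximum(sigma, 1e-12)          # guard against duplicate points
    S = np.exp(-d2 / np.outer(sigma, sigma))
    np.fill_diagonal(S, 0.0)
    return S


def nearest_link_components(S):
    """Partition points by connected components of an adjacency graph.

    ASSUMPTION: the adjacency rule is illustrative, not taken from the
    dissertation. Each point is linked to its most similar other point,
    and the components of the resulting undirected graph are the clusters.
    """
    n = S.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        j = int(np.argmax(S[i]))              # most similar other point
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    remap = {}
    labels = np.empty(n, dtype=int)
    for i in range(n):
        labels[i] = remap.setdefault(find(i), len(remap))
    return labels


# Two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
S = self_tuning_similarity(X, k=2)
print(nearest_link_components(S))             # prints [0 0 0 1 1 1]
```

No eigenvectors are computed at any point, which is the source of the speed advantage over spectral clustering that the abstract mentions. The nearest-link rule used here tends to over-segment real data, so it should be read as a placeholder for whatever adjacency construction the dissertation actually defines.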

Keywords/Search Tags:

Cluster analysis, Similarity measure, Intrinsic dimension, Semi-supervised clustering, Family resemblance, Image segmentation, Image denoising

Related items

1	Study On Semi-supervised Spectral Clustering Algorithm And The Application In Image Segmentation
2	Research On Recommendation Algorithm Based On Semi-supervised AP Clustering And Adaptive Transfer Clustering
3	Research On Robust Segmentation Algorithm Based On Semi-Supervised Fuzzy Clustering
4	A Medical Image Segmentation System Design Based On The Semi-supervised Fuzzy Clustering
5	Research On Telecom Customer Segmentation Based On Semi-Supervised Affinity Propagation Clustering
6	Fuzzy Clustering And The Application On Image Segmentation
7	Research On Image Segmentation Algorithm Based On Semi-Supervised Clustering
8	Research On Semi-supervised Selective Clustering Ensemble
9	Based Semi-supervised Clustering Algorithm With Applications
10	The Segmentation And Application Of Modified Clustering Method On Medical Image