Research On Clustering Algorithms With Feature Preferences And Their Implementation

Posted on:2016-06-23

Degree:Master

Type:Thesis

Country:China

Candidate:L Fang

Full Text:PDF

GTID:2348330479976573

Subject:Computer Science and Technology

Abstract/Summary:

As a powerful tool for mining the structural information in data, clustering has been applied extensively in image processing, bioinformatics, data mining, and other fields. Depending on whether feature weights are introduced into the objective function, clustering algorithms fall into two categories: traditional clustering algorithms and feature-weighted clustering algorithms. Traditional clustering algorithms (e.g. k-means and fuzzy c-means) do not distinguish the influence of individual features, or their weights, on the clustering, so correlated and redundant features can degrade clustering performance on high-dimensional data. Existing feature-weighted clustering algorithms, in turn, do not necessarily produce weights that match users' expectations about the relative importance of, or preference among, features. To bridge this gap, this study proposes two kinds of clustering algorithms that fit users' actual feature preferences as closely as possible. The work is summarized below:

1. We improve the CFP algorithm of Sun et al., which is based on Bregman divergence, by extending it from globally weighted, cluster-independent features to locally weighted, cluster-dependent features, using the actual preferences given by specific users. The weights thereby reflect how much each feature contributes to each individual cluster, which overcomes the limitation that the original algorithm uses only globally weighted features. At the same time, the method incorporates feature preference constraints so that the learned weights better respect the prior relationships among features. The strategy can also be extended to related clustering algorithms. Finally, we verify the feasibility of the algorithm with experiments on UCI datasets.

2. We propose semi-supervised clustering algorithms that combine feature information with label information, expressing the prior information on the feature facet in the form of feature preferences. Conventional semi-supervised clustering algorithms supply semi-supervised information on only a single facet, either the feature facet or the sample facet. This research extends those single-facet algorithms by combining the two kinds of information. Experiments on datasets verify that the combined approach performs better than semi-supervised clustering on a single facet...
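
The difference between globally weighted and cluster-dependent features can be illustrated with a minimal sketch. This is not the CFP algorithm or the thesis's method: it swaps in a simple entropy-regularised style weight update (each cluster's weight for a feature shrinks exponentially with that feature's within-cluster dispersion), and every name here (`locally_weighted_kmeans`, `gamma`, `violated_preferences`, the pair format for preferences) is a hypothetical choice made for illustration. The preference check only reports which "feature a should outweigh feature b" pairs a cluster's weights break; it does not enforce them as a constraint.

```python
import math


def assign(points, centers, weights):
    """Assign each point to the cluster with the smallest weighted squared distance."""
    labels = []
    for x in points:
        costs = [
            sum(w * (xi - ci) ** 2 for w, xi, ci in zip(weights[k], x, centers[k]))
            for k in range(len(centers))
        ]
        labels.append(costs.index(min(costs)))
    return labels


def update_centers(points, labels, centers):
    """Recompute each center as the mean of its members (keep it if the cluster is empty)."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [x for x, lab in zip(points, labels) if lab == k]
        if not members:
            new_centers.append(list(old))
        else:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
    return new_centers


def update_weights(points, labels, centers, gamma):
    """One weight vector per cluster: low within-cluster dispersion -> high weight."""
    weights = []
    for k, c in enumerate(centers):
        members = [x for x, lab in zip(points, labels) if lab == k]
        disp = [sum((x[j] - c[j]) ** 2 for x in members) for j in range(len(c))]
        shift = min(disp)  # subtract the minimum for numerical stability
        raw = [math.exp(-(d - shift) / gamma) for d in disp]
        total = sum(raw)
        weights.append([r / total for r in raw])
    return weights


def locally_weighted_kmeans(points, init_centers, gamma=1.0, iters=20):
    """Alternate assignment, center update, and per-cluster weight update."""
    dim = len(points[0])
    centers = [list(c) for c in init_centers]
    weights = [[1.0 / dim] * dim for _ in centers]  # start from uniform weights
    for _ in range(iters):
        labels = assign(points, centers, weights)
        centers = update_centers(points, labels, centers)
        weights = update_weights(points, labels, centers, gamma)
    return assign(points, centers, weights), centers, weights


def violated_preferences(cluster_weights, prefs):
    """Return the (a, b) pairs, meaning 'a should outweigh b', that the weights break."""
    return [(a, b) for a, b in prefs if cluster_weights[a] < cluster_weights[b]]


# Toy data: the first group is tight in feature 0, the second is tight in feature 1.
points = [
    (0.0, 0.0), (0.1, 2.0), (-0.1, -2.0), (0.0, 1.0),
    (10.0, 10.0), (12.0, 10.1), (8.0, 9.9), (11.0, 10.0),
]
labels, centers, weights = locally_weighted_kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

On this toy data each cluster ends up with its own weight vector, so a single user preference such as "feature 0 matters more than feature 1" can be satisfied in one cluster and violated in another, which is the situation a globally weighted method cannot represent.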

Keywords/Search Tags:

Clustering analysis, Feature preferences, Feature weighting, Cluster-dependent, Semi-supervised learning, Semi-supervised clustering

Related items

1	Research On Semi-supervised Clustering And Classification Algorithm
2	The Study On Personalized Search Based On Semi-Supervised Clustering
3	Semi-Supervised Text Clustering Based On Feature Weighting
4	Semi-supervised Learning On Text Data
5	Semi-Supervised Clustering Analysis And Its Extended Research
6	Research On Semi-supervised Classification Algorithm Based On Clustering Ensemble
7	Research On Clustering Methods And Their Applications
8	Semi Supervised Clustering Algorithm And Its Application And Research
9	Research On Text Clustering Based On Semi-supervised Learning
10	Semi-supervised Clustering And Feature Selection For Symbolic Data