Clustering Ensemble And Classification Under The Perspective Of Three-way Decision

Posted on: 2022-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: S B Zhao
Full Text: PDF
GTID: 2518306479971899
Subject: Computer technology
Abstract/Summary:
Rapidly developing internet, communication, and sensor technologies are continuously generating large-scale, heterogeneous, high-dimensional, complex data. How to mine useful knowledge from such data is a central issue in knowledge discovery and data mining. Clustering analysis and classification are effective data mining methods. However, when faced with complex data in realistic environments, two-way clustering cannot obtain effective results. Improving two-way clustering so that it can cope with such environments therefore has both theoretical significance and practical value.

Clustering analysis is a typical unsupervised machine learning technique. Its core idea is to assign similar objects to the same cluster and dissimilar objects to different clusters according to the similarity between objects, so that objects in the same cluster have high similarity and objects in different clusters have high dissimilarity. In real environments, however, there is usually no clear-cut membership relationship between objects and clusters, so traditional clustering algorithms cannot accurately describe the uncertain relationship between objects and clusters. In recent years, the analysis of complex data based on cognitive science has attracted the attention of many researchers. By introducing the theory of three-way decision, three-way clustering has been proposed. Three-way clustering describes the structure of a cluster by a core region and a fringe region, which effectively captures the fuzzy boundaries between clusters. Since three-way clustering was proposed, much research has been carried out on it, but there are still few studies on clustering ensemble problems and on applications of three-way clustering. From the perspective of three-way decision, this thesis conducts in-depth research on clustering ensemble and classification problems.

To address the inability of existing clustering ensemble methods to handle uncertain information in data sets effectively, we propose a multi-granulation three-way clustering ensemble algorithm based on shadowed sets. First, a set of clustering members is generated with the fuzzy c-means (FCM) algorithm, and shadowed sets map the FCM membership grades into three regions: the core region, the shadowed region, and the exclusion region. This procedure captures uncertain and noisy objects in the data set through multiple different clustering results. Second, objects are divided into different approximation regions by analyzing the uncertainty between objects and clusters. Objects in different approximation regions differ in their importance to clusters; that is, there is a partial order among the approximation regions. Finally, the shadowed set is used to classify the objects in the different approximation regions. Experiments on multiple UCI data sets show that the proposed algorithm obtains better results.

To address the low classification efficiency of the KNN algorithm on large-scale or high-dimensional data, we propose a fast KNN classification algorithm based on three-way clustering (TWC-KNN). TWC-KNN first clusters the training samples with the FCM algorithm. It then uses shadowed sets to map the FCM membership grades into three regions, constructing a three-way clustering. Finally, according to the positional relationship between the test sample and the center of each cluster, the original training set is pruned, which reduces the number of training samples and improves the classification efficiency of KNN. Experiments on multiple UCI data sets show that the proposed algorithm effectively improves classification efficiency while maintaining classification accuracy.
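
Both algorithms in the abstract rely on the same step: mapping FCM membership grades into a core, shadowed, and exclusion region. The abstract does not give the thresholding rule, so the following is only a minimal sketch that assumes the common shadowed-set formulation with a single threshold alpha in (0, 0.5): grades at or above 1 - alpha go to the core, grades at or below alpha go to the exclusion region, and the rest are shadowed. The function names, the candidate grid for alpha, and the sample grades are all hypothetical and are not taken from the thesis.

```python
def three_way_regions(grades, alpha):
    """Split one cluster's membership grades into three index lists.

    Assumed rule (not stated in the abstract): with 0 < alpha < 0.5,
    u >= 1 - alpha -> core, u <= alpha -> exclusion, otherwise shadowed.
    """
    core, shadowed, exclusion = [], [], []
    for i, u in enumerate(grades):
        if u >= 1.0 - alpha:
            core.append(i)
        elif u <= alpha:
            exclusion.append(i)
        else:
            shadowed.append(i)
    return core, shadowed, exclusion


def imbalance(grades, alpha):
    """Uncertainty-balance objective often used to choose alpha.

    Membership mass changed by elevating core grades to 1 and reducing
    exclusion grades to 0, compared with the size of the shadowed region.
    """
    core, shadowed, exclusion = three_way_regions(grades, alpha)
    elevated = sum(1.0 - grades[i] for i in core)
    reduced = sum(grades[i] for i in exclusion)
    return abs(elevated + reduced - len(shadowed))


def best_alpha(grades, steps=49):
    """Grid search over alpha = 0.01, 0.02, ..., 0.49 (hypothetical grid)."""
    candidates = [0.01 * k for k in range(1, steps + 1)]
    return min(candidates, key=lambda a: imbalance(grades, a))


# Hypothetical membership grades of six objects to a single cluster.
grades = [0.95, 0.90, 0.55, 0.40, 0.10, 0.05]
core, shadowed, exclusion = three_way_regions(grades, alpha=0.3)
alpha_star = best_alpha(grades)
```

With alpha fixed at 0.3, the first two objects fall in the core, the middle two in the shadowed region, and the last two in the exclusion region. In a full pipeline the grades would come from an FCM run, one column of the membership matrix per cluster.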
Keywords/Search Tags: three-way decision, three-way clustering, shadowed sets, clustering ensemble, KNN classification