
Research Of K-means Clustering Algorithm Based On Variable Precision Rough Set

Posted on: 2018-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: G Wang
Full Text: PDF
GTID: 2428330515999959
Subject: Computer technology
Abstract/Summary:
Data mining is an important, multidisciplinary part of the field of artificial intelligence and is widely used in data processing. Rough set (RS) theory, proposed by Pawlak in 1982, can deal with uncertain and incomplete information in a data set and can be used to extract important patterns from the data. However, when the data set contains noisy data, classical rough sets cannot maintain the accuracy of data processing. To improve robustness to noisy data, W. Ziarko proposed the variable precision rough set (VPRS) model in 1993. By introducing a precision parameter β, the model relaxes the strict requirements that RS theory places on the approximation boundary region, so that the upper and lower approximations of a set are extended to an arbitrary precision level β ∈ [0, 0.5). VPRS is thus an extension of classical RS.

Cluster analysis uses the differences between objects to reflect their similarity, so that differences between objects within a class are as small as possible and differences between objects in different classes are as large as possible. K-means is an important clustering method: given the number of clusters K and the initial cluster centers, it divides the sample data set into several classes or clusters. Its disadvantage is that the random selection of the initial cluster centers, the choice of K, and noise points all affect the clustering results.

This thesis addresses these shortcomings of the K-means algorithm. First, rough set theory is combined with K-means to obtain an adaptive K-means clustering algorithm. Second, variable precision rough set theory is combined with K-means to obtain a K-means clustering algorithm based on variable precision rough sets. The effectiveness of the proposed methods is verified by experiments on synthetic data sets. The main research work of this thesis is as follows:

1. An adaptive K-means clustering algorithm is proposed. It targets the influence of the initial cluster centers, the value of K, and noise points on K-means clustering. The algorithm does not require the initial cluster centers or K to be set in advance; instead, it exploits the continuous distribution of the dense regions of the data objects in the data set. The upper and lower approximations of rough set theory are combined with K-means to merge small classes, which finally yields an adaptive clustering.

2. A K-means clustering algorithm based on variable precision rough sets is proposed. It aims mainly to reduce the influence of noise points on the results of the adaptive K-means clustering algorithm. The algorithm combines variable precision rough set theory with K-means. Using class merging, it sets different radius values to compute the upper and lower approximation regions of the small classes within regions of continuous sample density, so that more samples fall into the approximation regions. At the same time, the cluster centers are computed with K-means and the number of clusters K is determined adaptively.

3. The adaptive K-means clustering algorithm and the K-means clustering algorithm based on variable precision rough sets are applied to the classification of synthetic data sets. The VPRS-based algorithm effectively reduces the effect of noisy data on the clustering. Setting different radii r determines the corresponding threshold β, from which the corresponding approximation regions and boundary region are obtained. The experimental results also show that the proposed algorithm achieves better clustering performance on the synthetic data sets.
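To make the variable precision idea concrete, the following is a minimal sketch of the β-lower and β-upper approximations in Ziarko's standard formulation, where the relative classification error of an equivalence class E with respect to a target set X is 1 - |E ∩ X| / |E|. It is not the thesis's clustering algorithm: the radius-based density step and the class merging are not reproduced, and the function name `vprs_approximations` and the toy data are assumptions made for illustration only.

```python
def vprs_approximations(classes, target, beta):
    """Return the (lower, upper) beta-approximations of `target`.

    `classes` is an iterable of non-empty sets (the equivalence classes),
    `target` is a set, and `beta` is the admissible error in [0, 0.5).
    A class joins the lower approximation when its classification error
    is at most beta, and the upper approximation when its error is
    strictly below 1 - beta.
    """
    if not 0 <= beta < 0.5:
        raise ValueError("beta must lie in [0, 0.5)")
    lower, upper = set(), set()
    for cls in classes:
        # Fraction of the class that falls outside the target set.
        error = 1 - len(cls & target) / len(cls)
        if error <= beta:
            lower |= cls
        if error < 1 - beta:
            upper |= cls
    return lower, upper


if __name__ == "__main__":
    # Hypothetical toy data: three equivalence classes of four objects each.
    classes = [{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}]
    target = {0, 1, 2, 3, 4, 5, 6, 8}
    for beta in (0.0, 0.25):
        lower, upper = vprs_approximations(classes, target, beta)
        print(beta, sorted(lower), sorted(upper))
```

With β = 0 the function reduces to the classical Pawlak approximations. Raising β lets a class with a few stray objects still count as part of the lower approximation, which is the tolerance to noise that the abstract attributes to VPRS.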
Keywords/Search Tags:Variable Precision Rough Set, Clustering analysis, K-means Algorithm, Adaptive Clustering