
Research and Application of the PBMMKM Clustering Algorithm Based on Attribute Weighting

Posted on: 2019-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2428330578972683
Subject: Computer software and theory
Abstract/Summary:
The K-means clustering algorithm is efficient and accurate on small data sets, but its accuracy drops significantly on large-scale data. K-means also requires the number of clusters to be fixed in advance, its results depend heavily on the randomly selected initial cluster centers, and it is very sensitive to noise points. To address these problems, this thesis proposes an improved K-means algorithm based on attribute weighting.

First, the thesis proposes the PWPCA algorithm, which combines Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). The linear mappings of LDA and PCA are used to reduce the data dimension. A weight is determined for each attribute by calculating its contribution rate and fitting with the least squares method, and attributes whose weights are close to zero are removed, which achieves attribute-weighted feature selection. K-means clustering is then performed on the dimension-reduced data, so PWPCA reduces the amount of computation and improves clustering accuracy. Compared with other algorithms, the experimental results show that K-means based on PWPCA effectively mitigates the sensitivity to outliers and the low accuracy on massive data sets.

Second, the thesis proposes PBMMKM (Parallel Bisecting Max Min K-means), a K-means clustering algorithm based on parallel bisecting and the maximum-minimum distance. The algorithm first divides the data set into a specified number of classes by fast parallel bisecting, then clusters within each class using the max-min distance idea, and finally merges nearest-neighbor classes according to a merging principle. The quality of the clustering result is evaluated with the BWP validity index. PBMMKM does not need the number of clusters to be specified, which addresses two problems of K-means: the number of clusters must be given in advance, and the initial cluster centers are selected at random. Compared with other algorithms, the simulation experiments show that the PBMMKM algorithm based on attribute weighting has high stability and accuracy.

Finally, the PBMMKM algorithm is applied to a customer relationship management (CRM) system, and customer cluster analysis is compared across the K-means algorithm, the MMKM algorithm, and the proposed PBMMKM algorithm. The customer segmentation results show that PBMMKM produces more accurate and finer-grained clusters, and that its results are closer to practical application.
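
The max-min distance idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual procedure, so the code below follows the textbook max-min distance rule instead: each new center is the point farthest from its nearest existing center, and selection stops once that distance falls below a fraction of the distance between the first two centers. The function name `max_min_centers`, the choice of the first point as the seed, and the stopping ratio `theta` are assumptions made for illustration. The parallel bisecting step, the class merging step, and the BWP index are not covered.

```python
import math


def max_min_centers(points, theta=0.5):
    """Select cluster centers with the textbook max-min distance rule.

    `theta` is a hypothetical stopping ratio, not a value from the thesis.
    The number of centers is not given in advance; it follows from `theta`.
    """
    if not points:
        return []

    # Assumption: seed with the first point in the data set.
    centers = [points[0]]
    # Distance from every point to its nearest chosen center.
    nearest = [math.dist(p, centers[0]) for p in points]

    # The second center is the point farthest from the seed; that distance
    # becomes the reference scale for the stopping test.
    idx = max(range(len(points)), key=nearest.__getitem__)
    reference = nearest[idx]
    if reference == 0.0:
        return centers  # all points coincide, so one center is enough

    while True:
        centers.append(points[idx])
        # Update each point's distance to its nearest center.
        nearest = [
            min(d, math.dist(p, points[idx])) for d, p in zip(nearest, points)
        ]
        # Candidate: the point with the largest nearest-center distance.
        idx = max(range(len(points)), key=nearest.__getitem__)
        if nearest[idx] <= theta * reference:
            break
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0),
            (10.1, 0.0), (5.0, 8.0), (5.0, 8.1)]
    print(len(max_min_centers(data)))  # prints 3
```

Centers chosen this way could then serve as the initial centers for an ordinary K-means run, which removes the random initialization and the need to fix the cluster count by hand. The result still depends on the seed point and on `theta`.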
Keywords/Search Tags:PWPCA algorithm, PBMMKM algorithm, BWP index, user subdivision