
Research And Improvement Of K-Means Clustering Algorithm

Posted on: 2019-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: X J Wu
GTID: 2438330566483717
Subject: Computer software and theory
Abstract/Summary:
This thesis studies the K-Means clustering algorithm in the context of customer relationship management (CRM), with the aim of improving its clustering accuracy. Two improvements are investigated: clustering based on manifold learning, and K-Means optimized with a genetic algorithm. The clustering accuracy of the three methods is then verified on a data set from the Teddy Cup data mining contest.

To replace K-Means clustering with manifold learning, the theoretical feasibility of the replacement is discussed first. It is then concluded that the quality of spectral clustering depends directly on how the similarity matrix is constructed. The relationship between similarity matrix construction and clustering accuracy is studied in a controlled experiment, and the accuracy of three different construction methods is compared with that of K-Means clustering. The similarity matrix is constructed in three ways, giving manifold clustering based on distance, on vector similarity, and on a norm. Experimental results show that the accuracy of manifold clustering is significantly higher than that of K-Means clustering, and that manifold clustering based on vector similarity improves accuracy the most. It is concluded that manifold learning can improve clustering accuracy.

The GA_Clustering algorithm was designed to address the instability of K-Means cluster centers. The initial population is constructed from iteratively generated cluster centers, a fitness function suited to this kind of data set is sought, and the genetic algorithm is used to find the global optimum. Labeled data from the Teddy Cup data mining series (the "400" data) are used as experimental data to train the algorithm. Experiments were designed around three factors that may affect accuracy: data scale, number of generations, and population size. The results show that accuracy is related to the data scale, while its relationship to the number of generations and the population size is not obvious. On this data set, clustering was best with a data size above 3,100, 30 generations, and a population size of 20.

An airline customer data set of 62,989 records from a data mining competition was used to build a complete customer relationship management process. The improved algorithm is applied to customer segmentation, and a corresponding marketing strategy is proposed for each customer class. The results on the experimental data set were evaluated by experts using proportional sampling. The experimental results show that the improved algorithm helps to improve clustering accuracy. The research yields a clustering method that can effectively improve customer relationship management, together with a complete CRM solution.
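
The abstract does not give the details of GA_Clustering, so the following is only a minimal sketch of the general idea of using a genetic algorithm to choose K-Means starting centers. Everything specific here is an assumption made for illustration: the function names (`ga_seed_centers`, `kmeans`, `sse`), the fitness (sum of squared errors, lower is better), truncation selection, one-point crossover, and the mutation rate. The defaults of 20 individuals and 30 generations simply echo the settings the abstract reports as best on its data set.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def ga_seed_centers(points, k, pop_size=20, generations=30,
                    mutation_rate=0.1, seed=0):
    """Search for k starting centers with a simple genetic algorithm.

    Assumed design (not taken from the thesis): each individual is a list
    of k data points, fitness is SSE, the better half survives unchanged,
    and children come from one-point crossover plus occasional mutation.
    """
    rng = random.Random(seed)
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        # Lower SSE is better: sort ascending and keep the better half.
        population.sort(key=lambda ind: sse(points, ind))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            if rng.random() < mutation_rate:
                # Mutation: replace one center with a random data point.
                child[rng.randrange(k)] = rng.choice(points)
            children.append(child)
        population = survivors + children
    return min(population, key=lambda ind: sse(points, ind))


def kmeans(points, centers, iterations=20):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)),
                      key=lambda i: sq_dist(p, centers[i]))
            clusters[idx].append(p)
        # Move each center to its cluster mean; keep it if the cluster is empty.
        centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers


# Toy data: two small, well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
seeded = ga_seed_centers(points, k=2)
refined = kmeans(points, seeded)
print(sorted(refined))
```

Because the better half of the population is carried over unchanged, the best SSE never gets worse from one generation to the next. The K-Means pass afterwards only refines the centers the search found, which is one common way to reduce sensitivity to the starting centers.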
Keywords/Search Tags: Customer relationship management, Clustering, Manifold learning, Genetic algorithm, K-Means clustering