Customer Lifetime Value Mining Based on Cluster Analysis

Posted on: 2005-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: P P Liang
GTID: 2208360125961123
Subject: Computer software and theory
Abstract/Summary:
Data mining, which emerged in the 1980s, has become a hot topic in knowledge research and a focus of the IT field. In recent years, academia and industry have made progress in both theoretical study and tool development.

Clustering analysis is one of the most important methods in data mining. Clustering is a basic form of human cognition: with proper clustering, we can distinguish things more easily. Clustering analysis can be used as a tool for discovering deep information in a database, and it can also serve as a preprocessing step for other data mining techniques. It is a challenging field with several requirements: scalability; the ability to handle different types of data; the ability to find clusters of arbitrary shape; minimal dependence on input parameters; the ability to handle abnormal data; insensitivity of the results to the order of the input data; the ability to handle multidimensional data; constraint-based clustering; and interpretability and usability of the results.

This thesis focuses on the techniques and theories of data mining, with emphasis on the theory and application of K-means. Considerable work has been done on the related theories. The key contributions are as follows.

First, we study clustering algorithms, especially K-means. The thesis presents the theoretical and practical limitations of K-means: it can be used only when the mean of a cluster is defined; it is sensitive to "noise" points and outliers; it is sensitive to the initial cluster centers; and so on.

Second, we propose a new method for finding the initial centers. Because K-means is sensitive to the initial centers and those centers are selected at random, different runs give different results. We improve K-means by finding better initial centers with a grid, in a method called CGKM (Center Finding Based on Gridding K-means). We partition each dimension into p parts, which gives p^m subspaces (grid cells), where m is the number of dimensions. We then calculate the density of each subspace, that is, the number of points it contains, and sort all the subspaces in descending order of density. Based on the number of clusters we want to create, we choose the densest subspaces as the initial centers, and then run K-means from these initial centers. Experiments on random points and on training data are presented. They show that the new algorithm finds a clustering result of better quality in fewer iterations than the traditional algorithm.

In addition, we put the new algorithm into practice. We present a data mining model called CLV-Miner (Customer Lifetime Value Miner), aimed at the automobile business. The model follows the data mining process and provides the following functions: data extraction and data transformation, CLV mining (based on different attributes and CGKM), and presentation of results. We developed the model with Java and SQL Server, using DTS as the tool for data extraction, data cleaning, data transformation and data loading. We set up a data warehouse and performed OLAP analysis using Analysis Manager. Results are presented through Excel, graphs, and so on.

Finally, we summarize the new method and the mining model, which can serve as a basis for further design and study, and which provide an approach to designing and studying mining models in other industries.
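The grid-based seeding described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the thesis's Java implementation. The function names `grid_initial_centers` and `kmeans`, the equal-width partition over each dimension's observed range, and the use of the mean of a cell's points as that cell's center are all assumptions, since the abstract does not specify these details.

```python
from collections import defaultdict
from math import dist


def grid_initial_centers(points, k, p):
    """Pick k initial centers from the k densest cells of a grid
    that splits each dimension into p equal-width parts."""
    m = len(points[0])  # number of dimensions
    lows = [min(pt[d] for pt in points) for d in range(m)]
    highs = [max(pt[d] for pt in points) for d in range(m)]

    cells = defaultdict(list)  # cell index tuple -> points falling in that cell
    for pt in points:
        index = []
        for d in range(m):
            width = (highs[d] - lows[d]) / p
            # Clamp so the maximum value lands in the last part (p - 1).
            i = min(int((pt[d] - lows[d]) / width), p - 1) if width > 0 else 0
            index.append(i)
        cells[tuple(index)].append(pt)

    if len(cells) < k:
        raise ValueError("fewer than k non-empty cells; try a larger p")

    # Density of a cell = number of points in it; keep the k densest cells.
    densest = sorted(cells.values(), key=len, reverse=True)[:k]
    # Assumption: each chosen cell is represented by the mean of its points.
    return [tuple(sum(c) / len(cell) for c in zip(*cell)) for cell in densest]


def kmeans(points, centers, max_iter=100):
    """Plain K-means (Lloyd iterations) started from the given centers.
    Returns the final centers, the labels, and the iterations used."""
    centers = [tuple(c) for c in centers]
    labels = []
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda j: dist(pt, centers[j]))
                  for pt in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to become empty.
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members))
                if members else old)
        if new_centers == centers:  # centers stopped moving: converged
            break
        centers = new_centers
    return centers, labels, iterations
```

Calling `kmeans(points, grid_initial_centers(points, k, p))` runs K-means from the grid-derived seeds instead of random ones. Because the seeds depend only on the data and on p, repeated runs give the same result, which is the property the abstract contrasts with random initialization.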
Keywords/Search Tags: Data mining, Clustering, K-means, CRM, Customer Lifetime Value