Researching And Improvement Of Clustering Algorithm In Data Mining Area And Its Application

Posted on:2015-11-26

Degree:Master

Type:Thesis

Country:China

Candidate:Y Liu

GTID:2298330467975480

Subject:Computer technology

Abstract/Summary:

Clustering is currently one of the most important data mining techniques and is mainly used to group the objects in a data source into categories. Because clustering plays such an important role in the data mining process, it has attracted growing attention from both the scientific community and industry. Clustering is also accurate and efficient at uncovering hidden relationships in massive data. Its applications are extensive, ranging from general information indexing and artificial intelligence to the analysis of data throughput; intrusion detection systems, for example, are based on clustering.

This thesis first introduces common clustering algorithms and analyzes their characteristics and application scenarios. It then analyzes the K-means clustering algorithm. K-means has several defects. The number of clusters in the final result must be fixed in the initial stage of the algorithm. K-means is also unstable: for the same set of data objects, different initial cluster centers lead to different clustering results. As a consequence, the final result is often only a locally optimal solution rather than the global optimum.

To address these problems, this research improves K-means in three respects: excluding isolated points, selecting the initial cluster centers, and determining the final number of clusters. The goal is to make the clustering results of the improved algorithm more accurate. In the improved algorithm, the number of clusters is determined with an improved index function based on the average silhouette coefficient, the initial cluster centers are chosen with an improved max-min distance method, and outliers are excluded by combining K-means with a density-based clustering method.

Finally, the improved algorithm is applied to customer segmentation in an electronic enterprise's CRM system. A CRM customer segmentation model is established, and data preprocessed through the CRM system is used as the test data set. Clustering divides the customers into several categories, and a corresponding marketing program is drawn up for each category.
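
The three improvements named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not give its improved index function or its improved max-min rule, so the code below uses the textbook forms (plain average silhouette, standard max-min distance seeding, and a simple neighbour-count density filter). All function names and the parameters `eps`, `min_pts`, and `candidates` are assumptions made for illustration.

```python
import numpy as np

def pairwise(X, Y):
    """Euclidean distance between every row of X and every row of Y."""
    return np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)

def drop_outliers(X, eps, min_pts):
    """Density filter (assumed form): keep points with >= min_pts neighbours within eps."""
    neighbours = (pairwise(X, X) <= eps).sum(axis=1) - 1  # do not count the point itself
    return X[neighbours >= min_pts]

def maximin_centers(X, k):
    """Max-min seeding: each new center is the point farthest from its nearest chosen center."""
    first = int(np.argmin(np.linalg.norm(X - X.mean(axis=0), axis=1)))
    chosen = [first]
    nearest = np.linalg.norm(X - X[first], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(nearest))
        chosen.append(nxt)
        nearest = np.minimum(nearest, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen].astype(float)

def kmeans(X, k, iters=100):
    """Standard K-means iterations started from the max-min seeds (deterministic)."""
    centers = maximin_centers(X, k)
    for _ in range(iters):
        labels = pairwise(X, centers).argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return pairwise(X, centers).argmin(axis=1), centers

def mean_silhouette(X, labels):
    """Average silhouette coefficient; singleton clusters score 0 by convention."""
    ids = np.unique(labels)
    if len(ids) < 2:
        return -1.0  # silhouette is undefined for a single cluster
    d = pairwise(X, X)
    scores = []
    for i, own in enumerate(labels):
        same = labels == own
        if same.sum() == 1:
            scores.append(0.0)
            continue
        a = d[i, same].sum() / (same.sum() - 1)  # mean distance inside own cluster
        b = min(d[i, labels == c].mean() for c in ids if c != own)  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def choose_k(X, candidates):
    """Pick the candidate k whose clustering has the highest average silhouette."""
    return max(candidates, key=lambda k: mean_silhouette(X, kmeans(X, k)[0]))
```

A typical call sequence would be `clean = drop_outliers(X, eps, min_pts)`, then `k = choose_k(clean, range(2, 10))`, then `labels, centers = kmeans(clean, k)`. Because the seeding is deterministic, repeated runs on the same data give the same result, which addresses the instability the abstract attributes to arbitrary initial centers.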

Keywords/Search Tags:

Data Mining, Cluster analysis, K-means algorithm

Related items

1	The Research And Application Of Improved K-Means Algorithm In Data Mining
2	The Research Of K-means Clustering Algorithm In Data Mining
3	Data Mining Technology And Its Application In The Supermarket In Crm
4	Improvement And Application Of K-means Clustering Algorithm
5	Research Of K-Means Clustering In Data Mining Based On Genetic Algorithm
6	Research And Application Of K-means Algorithm Based On Density And Distance
7	Methods And Applications Study Of Cluster-based Spatial Data Mining
8	The Reaserch Of Clustering Techlogies In Data Mining
9	Researching And Improvement Of Clustering Algorithm In Data Mining Area And Its Application
10	Data Mining, Cluster Analysis Algorithm Research And Application