Clustering Fusion Algorithm And Its Applications In The Mobile Communications Companies

Posted on: 2010-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: H Luo
Full Text: PDF
GTID: 2208360278470185
Subject: Computer application technology
Abstract/Summary:
Clustering is one of the most active topics in data mining research. Its main task is to group data into classes, or clusters, so that the resulting groups are meaningful. With the development of database technology, the volume of data from all walks of life is growing rapidly. Moreover, data types have shifted from purely numerical or purely nominal to mixed, which makes cluster analysis considerably harder. Most algorithms work well when the data to be processed are of a single type, but perform poorly on mixed-type data. In this thesis, we focus on clustering methods for mixed data.

After studying the existing clustering algorithms, we propose a graph-based clustering ensemble. Its ensemble members are produced by the k-prototypes algorithm and the CBL algorithm, both of which can handle mixed data. The graph-based algorithm maps the objects' attributes onto a graph, using the concepts of vertices and edges, and derives an ensemble function from a proposed weighted shared nearest neighbors graph. Experiments show that the new ensemble algorithm not only handles mixed data better than a single algorithm but is also more efficient. We also examine the relationships between four diversity measures and the accuracy of the clustering ensembles. Experiments show that when ensembles of moderate size are built with suitable clustering algorithms for a data set with a uniform cluster distribution, the correlation coefficients between the diversity measures and ensemble performance are relatively high.

Finally, we successfully apply the combined algorithm to the segmentation of telecom customers.
An analysis of customer information, communication behavior, service behavior, and related data, together with the relationships between the communication behavior of customer segments and service types and between customer segments and profit, shows that the clustering ensemble algorithm is efficient.
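
The abstract does not spell out the ensemble function, so the following is only a minimal sketch of the general idea of combining base clusterings through a shared nearest neighbors graph. It assumes the base label lists are already available (in the thesis they would come from k-prototypes and CBL; here they are hard-coded, hypothetical values), uses plain unweighted neighbor-overlap counts rather than the weighting proposed in the thesis, and extracts the consensus as connected components, which is an illustrative choice. All function names are hypothetical.

```python
from itertools import combinations

def co_association(labelings):
    """Fraction of base clusterings that place each pair of objects together."""
    n = len(labelings[0])
    return [[sum(lab[i] == lab[j] for lab in labelings) / len(labelings)
             for j in range(n)] for i in range(n)]

def knn_sets(similarity, k):
    """For each object, the set of its k most similar other objects."""
    n = len(similarity)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # list.sort is stable, so tied similarities keep ascending index order
        others.sort(key=lambda j: -similarity[i][j])
        result.append(set(others[:k]))
    return result

def snn_weights(similarity, k):
    """Edge weight = number of k-nearest neighbors two objects share.
    Assumption: plain counts, not the thesis's proposed weighting."""
    neighbors = knn_sets(similarity, k)
    n = len(similarity)
    weights = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        weights[i][j] = weights[j][i] = len(neighbors[i] & neighbors[j])
    return weights

def consensus(weights, threshold):
    """Label the connected components of the graph with edges >= threshold."""
    n = len(weights)
    labels = [None] * n
    current = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] is None and weights[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical base clusterings of six objects; the third one disagrees
# with the other two about object 2.
base_labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
weights = snn_weights(co_association(base_labelings), k=2)
print(consensus(weights, threshold=1))
```

On this toy input the sketch places objects 0 to 2 in one group and objects 3 to 5 in another, so the single disagreement in the third base clustering is outvoted. The neighborhood size `k` and the edge `threshold` are free parameters of the sketch, not values taken from the thesis.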
Keywords/Search Tags: Data Mining, Clustering Ensemble, Diversity Measure, Shared Nearest Neighbors, Customer Segmentation