Clustering Ensemble Algorithm And Its Application Research

Posted on: 2011-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: J Hou
Full Text: PDF
GTID: 2178360305494627
Subject: Information and Communication Engineering
Abstract/Summary:
Clustering is an effective tool for data analysis and is widely used in fields such as data mining, numerical analysis, and pattern recognition. Its task is to partition a dataset into clusters so that objects within a cluster are highly similar to one another but very dissimilar to objects in other clusters. Clustering is an unsupervised method, and no single algorithm can identify every form of data structure or distribution, so we consider applying ensemble techniques to cluster analysis. Prior knowledge about the data can improve clustering performance, yet very few previous clustering ensemble algorithms take the prior knowledge of the datasets into account.

Based on this situation, this thesis focuses on a clustering ensemble algorithm built on prior knowledge and discusses its application to customer segmentation in the China Mobile Communications industry. The main work is as follows.

A clustering ensemble algorithm based on prior knowledge and spectral analysis is proposed. Several subsets are produced by random sampling, and spectral clustering is applied to each of them to generate the clustering members. Because not all of the members are of good quality, the concept of a confidence factor is introduced, and only the better members, as judged by the value of this factor, are combined. The members are combined through a co-association matrix, and the final result is obtained by applying spectral clustering to this matrix. Experiments on artificial and real datasets demonstrate that the proposed method effectively improves clustering performance and that the new algorithm is more effective than single clustering algorithms and classical clustering ensemble algorithms.

Building on this analysis of the clustering ensemble, the algorithm is successfully applied to customer segmentation for the Mobile Communications Corporation. By mining related attributes such as customer information, communication behavior, and service behavior, we analyze the relationship between the communication behavior of customer segments and service types, and between customer segments and profit. The experimental results show that the clustering ensemble algorithm is effective.
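
The combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation, and several parts are assumptions made only for illustration. The abstract does not define the confidence factor, so the sketch uses a hypothetical one: the fraction of must-link and cannot-link pairs (one possible form of prior knowledge) that a clustering member satisfies. The clustering members are given directly as label lists rather than produced by spectral clustering on random subsets, with None marking points a member did not sample. The final step uses connected components of a thresholded co-association graph as a simple stand-in for spectral clustering. All function names, the threshold, and the minimum confidence value are hypothetical.

```python
from itertools import combinations


def confidence(labels, must_link, cannot_link):
    """Hypothetical confidence factor: share of prior-knowledge constraints
    a member satisfies, counted only over pairs it actually sampled."""
    constraints = [(p, True) for p in must_link] + [(p, False) for p in cannot_link]
    hits = total = 0
    for (i, j), want_same in constraints:
        if labels[i] is None or labels[j] is None:
            continue  # pair not covered by this member's random subset
        total += 1
        hits += (labels[i] == labels[j]) == want_same
    return hits / total if total else 0.0


def co_association(members, n):
    """Entry [i][j] is the fraction of members, among those that sampled
    both points, that place i and j in the same cluster."""
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        seen = [m for m in members if m[i] is not None and m[j] is not None]
        if seen:
            share = sum(m[i] == m[j] for m in seen) / len(seen)
            matrix[i][j] = matrix[j][i] = share
    return matrix


def consensus(matrix, threshold=0.5):
    """Stand-in for the final spectral step: connected components of the
    graph linking pairs whose co-association exceeds the threshold."""
    n = len(matrix)
    labels = [None] * n
    current = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] is None and matrix[u][v] > threshold:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


def ensemble(members, n, must_link, cannot_link, min_confidence=0.6):
    """Keep only members whose confidence reaches min_confidence, then
    combine them through the co-association matrix."""
    kept = [m for m in members
            if confidence(m, must_link, cannot_link) >= min_confidence]
    return consensus(co_association(kept, n))


# Toy data: 6 points whose intended groups are {0, 1, 2} and {3, 4, 5}.
members = [
    [0, 0, 0, 1, 1, None],
    [1, 1, None, 0, 0, 0],
    [0, 1, 0, 1, 0, 1],      # poor member, violates both must-link pairs
    [None, 0, 0, 1, 1, 1],
]
must_link = [(0, 1), (3, 4)]
cannot_link = [(1, 4)]
final_labels = ensemble(members, 6, must_link, cannot_link)
print(final_labels)
```

In this toy run the third member satisfies only one of three constraints and is dropped before the co-association matrix is built, which is the role the abstract assigns to the confidence factor. Replacing the thresholding step with spectral clustering on the co-association matrix would bring the sketch closer to the described method.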
Keywords/Search Tags: cluster analysis, data mining, clustering ensemble, prior knowledge, confidence factor