Based Semi-supervised Clustering Algorithm With Applications

Posted on: 2011-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Xu
Full Text: PDF
GTID: 2208360308963019
Subject: Computer software and theory
Abstract/Summary:
Supervised learning requires label information for all of the data, while unsupervised learning makes no use of whatever label information is available, which leaves clustering to proceed blindly. Semi-supervised learning has been one of the focal problems in data mining in recent years because it combines the merits of supervised and unsupervised learning.

The paper mainly introduces semi-supervised algorithms and clustering analysis. The search-kmeans algorithm chooses center points that are only locally optimal, not globally optimal, which leads to unreasonable clustering results. Based on an analysis of the search-kmeans semi-supervised clustering algorithm, a DS-kmeans algorithm is presented that uses a dichotomy method on the unlabeled dataset to choose the center points. This method can make the clustering result globally optimal. In addition, choosing all of the center points requires only a single pass through the dataset. Compared with the search-kmeans algorithm, the DS-kmeans algorithm decreases the computational expense and reduces the time complexity.

The k-means algorithm and its improved variants require the number of clusters to be specified in advance, yet choosing that number is blind and arbitrary. To address this, a BSC-kmeans algorithm is presented that clusters the dataset without knowing the number of clusters beforehand. Using the Iris dataset and automatically generated datasets, the BSC-kmeans algorithm is analyzed in depth by varying the threshold and the labeled class information, and it is then compared with the search-kmeans algorithm, which belongs to the same class of algorithms. The tests show that the BSC-kmeans algorithm achieves better accuracy than the search-kmeans algorithm.

Finally, the BSC-kmeans and DS-kmeans algorithms are applied to customer segmentation for Haier. Through this comparison, Haier's customers are segmented into different groups and the characteristics of each group are studied. The resulting judgments provide ancillary support for the company's business analytics and decision making.
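
To make the general idea of semi-supervised k-means concrete, the following is a minimal sketch of a generic seeded k-means, in which a few labeled points fix the initial center of each cluster. It is not the search-kmeans, DS-kmeans, or BSC-kmeans algorithm from the thesis, whose details the abstract does not give. The function names `mean` and `seeded_kmeans` and the `seeds` dictionary format are assumptions made for illustration only.

```python
from math import dist


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def seeded_kmeans(points, seeds, max_iter=100):
    """Cluster unlabeled `points`, using labeled `seeds` to set the initial centers.

    seeds: dict mapping a class label to a list of labeled points
           (hypothetical input format, chosen for this sketch).
    Returns (centers, assignments); assignments[i] is the label given to points[i].
    """
    labels = sorted(seeds)
    # One initial center per labeled class: the mean of that class's seeds.
    centers = {lab: mean(seeds[lab]) for lab in labels}
    assignments = None
    for _ in range(max_iter):
        # Assign every unlabeled point to its nearest center.
        new_assignments = [
            min(labels, key=lambda lab: dist(p, centers[lab])) for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so the centers are stable
        assignments = new_assignments
        # Recompute each center; seeds always stay in their own class.
        for lab in labels:
            members = [p for p, a in zip(points, assignments) if a == lab]
            members += seeds[lab]
            centers[lab] = mean(members)
    return centers, assignments


if __name__ == "__main__":
    unlabeled = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
    labeled = {"a": [(0.0, 0.0)], "b": [(5.0, 5.0)]}
    print(seeded_kmeans(unlabeled, labeled))
```

In this sketch the number of clusters equals the number of labeled classes, so the labels remove the random initialization that plain k-means depends on. It does not attempt to discover additional clusters, which is the problem the abstract attributes to BSC-kmeans.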
Keywords/Search Tags: Data Mining, Semi-supervised, Clustering Analysis, Customer Segmentation