Research On Projective Clustering Algorithms With Applications For High-dimensional Data

Posted on: 2017-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: T Wu
Full Text: PDF
GTID: 2348330512462254
Subject: Software engineering
Abstract/Summary:
Clustering analysis is a key issue in data mining research and has been widely used in many fields, such as pattern recognition, gene expression analysis, and customer segmentation. With the development of computer and storage technology, high-dimensional data increasingly arise in clustering applications, including document data, transactional retail data, and spatial data. Because such data are so common, high-dimensional data clustering has become an important issue in the field of data mining.

For data sets in high-dimensional spaces, many conventional clustering algorithms that are well suited to low-dimensional data do not work well in terms of effectiveness and efficiency, because of the curse of dimensionality. To cluster data in high-dimensional spaces, projective clustering methods have attracted wide interest in recent years. Among them, soft subspace clustering methods, which are the main strategy for high-dimensional data clustering, have been widely studied and applied. However, most existing algorithms require some user-set parameters in advance and ignore the optimization of the projected subspace, which affects the accuracy and adaptability of the clustering algorithms.

In this thesis, we focus on projective clustering and propose several new methods based on existing clustering methods. In addition, the proposed methods are applied to mobile data. The issues addressed in this study involve two parts: overlapping subspace learning algorithms and non-overlapping subspace learning algorithms. The main contributions can be summarized as follows:

1. To address the weaknesses of existing soft subspace clustering methods and the lack of a subspace optimization method, a new soft subspace clustering algorithm is proposed. Maximizing the deviation of feature weights is proposed as the subspace optimization goal, and a quantitative formula is presented. Experimental results show that the proposed method significantly improves the clustering quality.

2. A novel adaptive projective clustering algorithm based on relative entropy is proposed to address the comparison of subspace clusters and the parameter settings of subspace clustering algorithms. During clustering, the optimal parameter values are calculated automatically from the data set and the derived formula. This further improves the stability and accuracy of the algorithm.

3. A new projective clustering method is proposed for mining the underlying friend relationships in mobile data. We model the circles of friends in mobile data as a set of non-overlapping subspace clusters, so that the problem of mining circles of friends can be transformed into a problem of non-overlapping subspace clustering.
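
For readers unfamiliar with soft subspace clustering, the following is a minimal sketch of the general idea: each cluster carries its own feature-weight vector, and dimensions along which the cluster is compact receive larger weights. This is a generic entropy-weighted k-means variant, not the algorithm proposed in the thesis. The function name `soft_subspace_kmeans`, the smoothing parameter `gamma`, and the use of the mean squared deviation as the dispersion measure are assumptions made only for illustration.

```python
import numpy as np


def soft_subspace_kmeans(X, k, gamma=1.0, n_iter=20, seed=0):
    """Illustrative entropy-weighted soft subspace k-means.

    Not the thesis's method: a generic sketch in which every cluster
    keeps one weight per feature, and the weights of a cluster sum to 1.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialise centers from k distinct data points, weights uniformly.
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    weights = np.full((k, d), 1.0 / d)

    for _ in range(n_iter):
        # Assign each point to the cluster with the smallest
        # feature-weighted squared distance.
        sq = (X[:, None, :] - centers[None, :, :]) ** 2  # shape (n, k, d)
        labels = np.argmin((sq * weights[None, :, :]).sum(axis=2), axis=1)

        for c in range(k):
            members = X[labels == c]
            if len(members) == 0:
                continue  # empty cluster: keep previous center and weights
            centers[c] = members.mean(axis=0)
            # Per-dimension dispersion (assumed: mean squared deviation).
            disp = ((members - centers[c]) ** 2).mean(axis=0)
            # Softmax of negative dispersion: compact dimensions get
            # larger weights; gamma controls how peaked the weights are.
            logits = -disp / gamma
            logits -= logits.max()  # numerical stability
            w = np.exp(logits)
            weights[c] = w / w.sum()

    return labels, centers, weights
```

Note that `gamma` is exactly the kind of user-set parameter the abstract criticizes: a small value concentrates each cluster on very few dimensions, while a large value approaches ordinary k-means with uniform weights.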
Keywords/Search Tags: high-dimensional data, projective clustering, subspace optimization, subspace cluster