On Sparse AP Clustering Algorithm Based On Outliers Detection

Posted on: 2019-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 2348330569989655
Subject: Probability theory and mathematical statistics
Abstract/Summary:
When analyzing and processing high-dimensional data, it is commonly assumed that most of the variables are noisy or redundant and that only a small fraction of them are useful for data mining, so variable selection is necessary. In this paper, we first detect outliers in high-dimensional or ultra-high-dimensional data: we reduce the dimension with weighted principal component analysis (WPCA) and then use the LOF algorithm, which is particularly effective in low dimensions, to identify outliers in the transformed space. After removing the outliers, we combine the idea of sparsity with the traditional AP clustering algorithm, using the principle that clustering minimizes the within-cluster sum of squares, and propose sparse AP clustering (Sparse AP). Finally, we apply the Sparse AP clustering algorithm to simulated data and to real gene microarray data; the results show that it yields effective clustering and variable selection.
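
The LOF step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `lof_scores`, the neighbourhood size `k`, and the toy points are assumptions made for illustration, and the points are assumed to already lie in the low-dimensional space produced by a prior reduction step such as WPCA. The sketch follows the standard LOF definition (k-distance, reachability distance, local reachability density) but simplifies it by taking exactly `k` neighbours when distances tie, and it assumes there are no duplicate points.

```python
import math


def lof_scores(points, k=3):
    """Return a Local Outlier Factor score for every point.

    Simplified sketch: exactly k neighbours are used even when distances
    tie, and duplicate points (zero distances) are assumed not to occur.
    """
    n = len(points)
    if n <= k:
        raise ValueError("need more than k points")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbours of each point (excluding itself) and its k-distance.
    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach(i, j) = max(k_distance[j], dist[i][j]).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Toy data (hypothetical): five points in a tight group plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (8, 8)]
scores = lof_scores(points, k=3)
for p, s in zip(points, scores):
    print(p, round(s, 3))
```

Scores close to 1 indicate a point whose local density matches that of its neighbours, while the isolated point receives a score well above 1. In a pipeline like the one described, points whose score exceeds a chosen threshold would be removed before clustering; the threshold itself is a tuning choice that the abstract does not specify.
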
Keywords/Search Tags: High-dimensional data, Outlier detection, Sparse AP clustering, Variable selection