Variable Selection For Gaussian Mixture Model-Based Clustering And Its Application

Posted on:2017-05-07

Degree:Master

Type:Thesis

Country:China

Candidate:Y W Chen

GTID:2348330503461381

Subject:Applied statistics

Abstract/Summary:

In high-dimensional cluster analysis, traditional methods lose their effectiveness as the dimension of the data increases. The primary problem in high-dimensional clustering is therefore to find appropriate methods for reducing the dimension of the data. This thesis combines dimension reduction through variable selection with Gaussian mixture model (GMM)-based clustering to develop penalized clustering methods and apply them. A penalized GMM can identify the variables that carry important information in high-dimensional data. First, we propose an L∞-penalized GMM that selects the variables important for clustering by shrinking the maximum mean parameters; a modified Bayesian information criterion (MBIC) is used to select the penalty parameter λ and the number of clusters K. Second, we propose an adaptive L∞-penalized GMM that, by adjusting the penalty parameters, applies lighter shrinkage to the important variables and heavier shrinkage to the unimportant ones, which compensates for the excessive penalization of important variables by the L∞-GMM. Finally, the adaptive L∞-GMM is applied to bioinformatics data. The results show that GMM clustering with the penalty term yields effective clustering of the high-dimensional data and identifies the important variables among the mouse protein gene expression levels.
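
The penalty idea in the abstract can be illustrated with a minimal sketch. It assumes the penalty is the L∞ norm of each variable's cluster means (the largest absolute mean across clusters), that the data are standardized so a variable whose cluster means are all zero carries no clustering information, and that adaptive weights are the inverse of an initial estimate raised to a power gamma. The function names, the weight formula, and the parameters gamma, eps, and tol are illustrative assumptions, not taken from the thesis, and the EM fitting step itself is not shown.

```python
import numpy as np

def column_max_abs(means):
    """Largest absolute cluster mean for each variable (means has shape K x p)."""
    return np.abs(np.asarray(means, dtype=float)).max(axis=0)

def linf_penalty(means, lam, weights=None):
    """Penalty term lam * sum_j w_j * max_k |mu_kj|; w_j = 1 gives the plain L-infinity penalty."""
    col_max = column_max_abs(means)
    if weights is None:
        weights = np.ones_like(col_max)
    return float(lam * np.sum(weights * col_max))

def adaptive_weights(initial_means, gamma=1.0, eps=1e-8):
    """Assumed adaptive weights: small for variables with large initial means
    (lighter shrinkage), large for variables with near-zero initial means."""
    return 1.0 / (column_max_abs(initial_means) + eps) ** gamma

def selected_variables(means, tol=1e-8):
    """Indices of variables whose cluster means are not all shrunk to zero."""
    return np.flatnonzero(column_max_abs(means) > tol)

# Two clusters, three variables; the second variable has been shrunk to zero.
means = np.array([[2.0, 0.0, 0.5],
                  [-1.0, 0.0, -0.5]])
penalty = linf_penalty(means, lam=0.5)   # 0.5 * (2.0 + 0.0 + 0.5)
weights = adaptive_weights(means)        # smallest weight on the first variable
kept = selected_variables(means)         # variables 0 and 2 remain
```

In a full procedure, a term of this kind would be subtracted from the GMM log-likelihood inside the EM iterations, and the fit would be repeated over a grid of λ and K values scored by an information criterion.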

Keywords/Search Tags:

Variable Selection, L∞-GMM, Adaptive L∞-GMM, EM Algorithm, High-dimensional Clustering Analysis

Related items

1	Application And Research On Clustering Algorithm In Large Scale High Dimensional Datasets
2	Research And Application Of Clustering Algorithm On The High Dimensional Datasets
3	On Sparse AP Clustering Algorithm Based On Outliers Detection
4	Research On Clustering Algorithms For High-Dimensional Data
5	Clustering Method Based On Variable Selection And Its Application
6	Statistical Analysis Of High-dimensional Data Based On Feature Selection
7	Research On High Dimensional Data Clustering Based On Improved Evolutionary Algorithm
8	Feature Selection And Clustering For High-dimensional Data
9	Particle swarm optimizer: Applications in high-dimensional data clustering
10	Research And Application Of Rough Clustering Algorithm For High Dimensional Data Sets