Clustering Method Based On Variable Selection And Its Application

Posted on: 2018-05-02  Degree: Master  Type: Thesis
Country: China  Candidate: X X Zhang  Full Text: PDF
GTID: 2348330533957200  Subject: Applied statistics
Abstract/Summary:
Variable selection, commonly used in clustering algorithms, selects the optimal variables step by step within a subspace according to model fit. However, for high-dimensional data some variable selection methods are computationally complex and therefore expensive to use in practice, and the selected variables may even confuse the clustering process. New, effective methods, or improvements to the classical ones, are therefore needed. This thesis improves the original clustering method by combining feature selection techniques with clustering methods in order to improve the precision of the clustering results. The method builds on the variable selection idea of VSCC, proposed by Jeffrey L. Andrews in 2013, and uses the correlation between variables to define the selection procedure. To partition the data, an initial clustering is performed to obtain class labels, from which the within-cluster variance is calculated; the correlations between variables and the within-cluster variance are then used to select the key variables, which are used in the final clustering. Two simulated data sets and four real examples from agriculture and medicine demonstrate the accuracy and feasibility of the method.
Keywords/Search Tags: Clustering Analysis, Variable Selection, Correlation, Gap Statistics, Within-Cluster-Variance
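
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the toy data, and the acceptance rule (keep a variable only if its absolute correlation with every already selected variable is below 1 minus its within-cluster variance raised to a chosen power) are assumptions made for illustration. The class labels are assumed to come from some initial clustering, which is not shown here.

```python
from math import sqrt


def standardize(column):
    """Scale one variable to mean 0 and population variance 1."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((x - mean) ** 2 for x in column) / n)
    return [(x - mean) / sd for x in column]


def within_cluster_variance(column, labels):
    """Pooled within-cluster variance of one variable, given class labels."""
    total = 0.0
    for group in set(labels):
        members = [x for x, lab in zip(column, labels) if lab == group]
        centre = sum(members) / len(members)
        total += sum((x - centre) ** 2 for x in members)
    return total / len(column)


def correlation(a, b):
    """Pearson correlation of two variables that are already standardized."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def select_variables(columns, labels, power=1):
    """Return indices of kept variables (hypothetical acceptance rule).

    Variables are visited in order of increasing within-cluster variance w.
    The first is always kept; a later variable j is kept only if
    |corr(j, r)| < 1 - w[j] ** power for every variable r kept so far.
    """
    z = [standardize(c) for c in columns]
    w = [within_cluster_variance(c, labels) for c in z]
    order = sorted(range(len(z)), key=lambda j: w[j])
    selected = [order[0]]
    for j in order[1:]:
        threshold = 1.0 - w[j] ** power
        if all(abs(correlation(z[j], z[r])) < threshold for r in selected):
            selected.append(j)
    return selected


# Toy data: 8 observations, 4 clusters (labels from an assumed initial clustering).
labels = [0, 0, 1, 1, 2, 2, 3, 3]
columns = [
    [0.0, 0.2, 0.1, 0.3, 5.0, 5.2, 5.1, 5.3],   # separates clusters {0,1} from {2,3}
    [0.0, 0.4, 5.0, 5.4, 0.1, 0.5, 5.1, 5.5],   # separates clusters {0,2} from {1,3}
    [0.3, -0.1, -0.2, 0.6, 5.3, 4.9, 4.8, 5.6],  # noisier near-copy of variable 0
    [1.0, 5.0, 5.0, 1.0, 1.0, 5.0, 5.0, 2.0],   # mostly unrelated to the clusters
]
selected = select_variables(columns, labels)
print(selected)
```

On this toy data the sketch keeps variables 0 and 1: variable 2 is dropped because it is almost perfectly correlated with variable 0, and variable 3 is dropped because its within-cluster variance is close to 1, which leaves almost no room for correlation under the assumed rule. The kept columns would then be passed to the final clustering step.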