Unsupervised Feature Selection Method Based On Integer Programming

Posted on: 2021-01-31 | Degree: Master | Type: Thesis
Country: China | Candidate: H Y Jiang | Full Text: PDF
GTID: 2370330605972048 | Subject: Applied Mathematics
Abstract/Summary:
The rapid development of science and the wide application of information technology have produced large amounts of high-dimensional data. Many dimensionality reduction methods have been proposed to extract useful information from such data. Feature selection removes redundant features while preserving the key informative ones, and it is regarded as an important way to circumvent the curse of dimensionality. This thesis studies unsupervised feature selection based on integer programming, with the aim of providing an approach to solving optimization models that involve the matrix l2,0-norm, especially models related to feature selection. The specific work is as follows. First, a novel unsupervised feature selection optimization model is proposed, based on the properties of the data features and a matrix l2,0-norm constraint on the projection matrix. The lp-box approach developed by Wu Baoyuan et al. is then used to transform the optimization model into an equivalent form, yielding a sparse self-representation based unsupervised feature selection method (lpbox SEFS). Although this method achieves good results, the correlation between data features may or may not be strong, while the data samples themselves often exhibit some correlation. Therefore, to preserve the geometric information of the data and to extract more accurate and effective feature information, a second unsupervised feature selection method is proposed that incorporates a sparse graph structure (lpbox SEFS Graph, lpbox SEFSG). For both models, it is hard to obtain the global solution of the discrete matrix l2,0-norm constrained problem during optimization. To address this issue, the thesis transforms the matrix l2,0-norm constraint into a 0-1 integer constraint. Because 0-1 integer constrained programming is itself very hard to solve, the lp-box approach is adopted, which replaces the 0-1 constraints with two continuous constraints. Finally, the alternating direction method of multipliers is used to optimize the problem under the new constraints: the unsupervised feature selection problem is decomposed into several sub-problems that are solved sequentially until convergence. Clustering and classification experiments on five public data sets show that the proposed methods outperform other state-of-the-art unsupervised feature selection methods and select features with stronger discriminative ability when the number of features is specified.
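The two ingredients named in the abstract, the matrix l2,0-norm and the replacement of 0-1 constraints by two continuous constraints, can be illustrated with a small sketch. It assumes the standard lp-box identity from Wu and Ghanem's work, in which a vector x is binary exactly when it lies in the box [0,1]^n and on the lp-sphere ||x - 1/2||_p^p = n / 2^p. The function names (l20_norm, in_box, on_sphere, is_binary_via_lp_box) are hypothetical and chosen for illustration; the thesis's actual model and solver are not reproduced here.

```python
import numpy as np


def l20_norm(W, tol=1e-12):
    """Matrix l2,0-norm: the number of rows of W with nonzero Euclidean norm."""
    row_norms = np.linalg.norm(W, axis=1)
    return int(np.sum(row_norms > tol))


def in_box(x, tol=1e-9):
    """First continuous constraint: every entry of x lies in [0, 1]."""
    return bool(np.all(x >= -tol) and np.all(x <= 1 + tol))


def on_sphere(x, p=2, tol=1e-9):
    """Second continuous constraint: ||x - 1/2||_p^p equals n / 2^p."""
    n = x.size
    return bool(abs(np.sum(np.abs(x - 0.5) ** p) - n / 2 ** p) <= tol)


def is_binary_via_lp_box(x, p=2):
    """x is a 0-1 vector exactly when it satisfies both constraints."""
    return in_box(x) and on_sphere(x, p)


# A projection matrix with one all-zero row keeps 2 of 3 features.
W = np.array([[1.0, 2.0], [0.0, 0.0], [0.0, 3.0]])
kept_rows = l20_norm(W)

# A binary vector passes; points satisfying only one constraint fail.
binary_ok = is_binary_via_lp_box(np.array([0.0, 1.0, 1.0, 0.0]))
box_only = is_binary_via_lp_box(np.array([0.5, 0.5, 1.0, 0.0]))
sphere_only = is_binary_via_lp_box(np.array([1.5, 0.5, 0.5, 0.5]))
```

A point inside the box but off the sphere, or on the sphere but outside the box, is rejected, so only the intersection recovers the 0-1 set. Splitting the discrete set into these two continuous sets is what makes it natural to handle each with its own sub-problem in an alternating scheme such as ADMM.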
Keywords/Search Tags: sparse self-representation, graph structure, 0-1 integer programming, matrix l2,0-norm constraint