Research On Multi-Class Feature Selection With Manifold-Regularized Extended Adaptive Lasso

Posted on: 2020-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Zhang
Full Text: PDF
GTID: 2428330575954457
Subject: Computer Science and Technology
Abstract/Summary:
With the continuous development of science and technology, high-dimensional data have become common in the real world, for example gene sequences in bioinformatics, time series, and hyperspectral remote sensing data. Although high-dimensional data carry more information than low-dimensional data, for a particular recognition task many of the features are redundant or unrelated to the task, and some are noise features that reduce classification accuracy. Operating directly on such data therefore runs into many difficulties, the most immediate being the curse of dimensionality. In addition, the computational and time costs of most learning algorithms grow with the feature dimension, the model becomes more complex, and its generalization ability declines.

Feature selection emerged to address these problems. Feature selection is the process of choosing, from the original feature set, representative features that are relevant to the recognition task, so as to reduce the dimension of the feature space. It is a key data preprocessing step in pattern recognition and an effective means of improving the performance of learning algorithms. Many feature selection methods have been proposed to pick relevant and representative features from high-dimensional data, such as the Lasso method, the Adaptive Lasso method, and the global redundancy minimization (GRM) method. In recent years, multi-label learning, which addresses the case where a sample belongs to several categories or tags at the same time, has also been widely used in practical applications. For the large number of features found in multi-label data, researchers have likewise proposed many feature selection methods from different perspectives. For example, Lin et al. proposed a multi-label feature selection method based on maximum dependence and minimum redundancy (mRMR), and Liu et al. proposed an online multi-label feature selection algorithm.

Building on these results, this thesis proposes a different feature selection method and gives a reasonable and effective iterative algorithm for it. The main contents of the thesis are as follows:

(1) To address the defects of the traditional Lasso and extended Lasso feature selection methods, we take the correlation between sample features and categories as a constraint, add weight constraints to the traditional Adaptive Lasso model, and propose an Extended Adaptive Lasso (EALasso) feature selection method. The method handles not only feature selection for two-class samples but also the more complicated multi-class, multi-label feature selection problem. When the objective function is optimized with the weights fixed, the L2,1-norm constraint drives the weighted regression coefficients to be as sparse as possible: entries of the coefficient matrix with larger weights are shrunk to 0, so features whose regression coefficients are estimated as zero are removed automatically, which achieves feature selection. For EALasso we provide an effective iterative solution algorithm and the corresponding proof of convergence. In this part we verify the validity of the proposed method with different discriminant methods on multi-class single-label data sets and on multi-class multi-label data sets.

(2) Considering the inherent links between data points and the local structure information of the data itself, we further extend EALasso and propose a graph-structure-based adaptive Lasso (MrALasso) feature selection method. We add a graph regularization term to the objective function so that sample points that are close in the original space stay as similar as possible in the low-dimensional space, which incorporates the inherent connections between data points and their local structure information. We give an effective iterative algorithm and the corresponding proof of convergence. Experimental results on several gene data sets show that our algorithm is more effective than other feature selection algorithms in the same field. However, MrALasso can only solve feature selection problems with two classes of samples.

(3) We further extend MrALasso from two classes to multiple classes, propose a multi-class adaptive Lasso feature selection method based on manifold regularization (EMrALasso), and provide an effective iterative solution algorithm and the corresponding proof of convergence. The experimental results confirm the effectiveness of the EMrALasso method on multiple data sets.
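The abstract does not state the objective functions, so the following is only a rough sketch of the general technique it describes, not the thesis's algorithm. It assumes an objective of the form ||XW - Y||_F^2 + lam * sum_i w_i ||W_i||_2 + gamma * tr(W^T X^T L X W), where W_i is the i-th row of the coefficient matrix, the adaptive weights w_i are taken as inverse row norms of an initial ridge estimate (a common Adaptive Lasso choice), and L is the Laplacian of a k-nearest-neighbor graph over the samples. The function names, the parameters lam, gamma, k, and the toy data are all hypothetical. The weighted L2,1 term is handled with a standard iteratively reweighted least-squares update.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized binary k-NN graph over the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                    # a sample is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]             # k nearest neighbors per sample
    A = np.zeros_like(dist)
    rows = np.repeat(np.arange(X.shape[0]), k)
    A[rows, idx.ravel()] = 1.0
    A = np.maximum(A, A.T)                            # make the graph undirected
    return np.diag(A.sum(axis=1)) - A


def manifold_adaptive_l21(X, Y, lam=1.0, gamma=0.01, k=5, n_iter=50, eps=1e-6):
    """Assumed objective (not taken from the thesis):
    ||XW - Y||_F^2 + lam * sum_i w_i * ||W_i||_2 + gamma * tr(W^T X^T L X W).
    """
    d = X.shape[1]
    G = X.T @ X + gamma * (X.T @ knn_laplacian(X, k) @ X)
    XtY = X.T @ Y
    # Initial ridge estimate; its row norms define the adaptive weights (assumption).
    W = np.linalg.solve(G + lam * np.eye(d), XtY)
    weights = 1.0 / (np.linalg.norm(W, axis=1) + eps)
    for _ in range(n_iter):
        # Reweighting step: the weighted L2,1 term is replaced by tr(W^T D W)
        # with D_ii = w_i / (2 * ||W_i||_2), then W is solved in closed form.
        row_norms = np.linalg.norm(W, axis=1)
        D = np.diag(weights / (2.0 * row_norms + eps))
        W = np.linalg.solve(G + lam * D, XtY)
    return W


# Toy data: only the first three of twenty features influence the targets.
rng = np.random.default_rng(0)
X = rng.standard_normal((120, 20))
B = np.array([[2.0, 0.0, -1.0], [0.0, 2.0, 1.0], [-1.0, 1.0, 2.0]])
Y = X[:, :3] @ B + 0.1 * rng.standard_normal((120, 3))

W = manifold_adaptive_l21(X, Y)
# Rank features by the row norm of W and keep the three largest.
selected = np.argsort(-np.linalg.norm(W, axis=1))[:3]
```

Rows of W whose norm is driven toward zero correspond to discarded features. The toy example uses continuous targets for simplicity; for classification, Y would typically be a one-hot (or multi-label indicator) matrix.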
Keywords/Search Tags: Adaptive Lasso, Weighted constraints, Manifold-regularized, Feature selection, Pattern recognition