
Research On Sparse-learning-based Multi-label Feature Selection Method

Posted on: 2021-02-18    Degree: Master    Type: Thesis
Country: China    Candidate: Y H Li
GTID: 2428330620972177    Subject: Computer technology
Abstract/Summary:
With the advent of the big data era, constantly updated products generate massive amounts of data, which is often complicated and high-dimensional. High-dimensional data contain much useful information but also noise. Feature selection extracts effective information from complicated data: it selects an optimal subset of the original feature set, which is then used for subsequent data analysis. Feature selection has become an effective way to handle high-dimensional data because it reduces time complexity and improves the efficiency of learning algorithms. Based on the type of label information, feature selection techniques fall into three groups: supervised, weakly supervised, and unsupervised. Based on the selection strategy, they can be categorized into three models: filter, wrapper, and embedded. This paper focuses on supervised methods. Embedded models integrate the advantages of filter and wrapper models while overcoming their drawbacks, so we concentrate on embedded methods. In addition, sparse models are interpretable: they use vector- or matrix-based regularization to reduce redundant information and preserve relevant information, so we integrate sparsity learning into the design of our feature selection method.

In multi-label learning, sparsity-learning-based feature selection methods play an important role: they preserve relevant features and eliminate irrelevant or redundant ones. Previous multi-label feature selection methods were designed following the ideas of single-label methods and consider only the feature matrix or the label matrix. However, the information in the data is described by two matrices: the feature matrix and the label matrix. To this end, we propose a novel feature selection method named Feature Selection considering the Shared Common Mode between features and labels (SCMFS). First, Coupled Matrix Factorization (CMF) is used to extract the common mode between the feature matrix and the label matrix. Then, Non-negative Matrix Factorization (NMF) is used to enhance the sparsity and interpretability of SCMFS. Finally, we design a simple yet effective optimization scheme with provable convergence. Experiments are conducted on twelve real-world multi-label benchmark data sets with two classifiers, SVM and k-NN; the proposed method is compared with five other methods in terms of macro-average and micro-average metrics, and the experimental results demonstrate its superiority.

In summary, the main contributions of this paper are as follows:
1. Extracting the shared common mode, i.e., the correlations between the feature matrix and the label matrix, by taking advantage of Coupled Matrix Factorization.
2. Introducing Non-negative Matrix Factorization, whose inherent clustering and interpretability properties help select the most discriminative features.
3. Proposing a novel multi-label feature selection method, SCMFS, based on the shared common mode between features and labels.
4. Developing an optimization method that solves the constrained problem of SCMFS with guaranteed convergence.
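To make the coupled-factorization idea concrete, the following is a minimal illustrative sketch, not the exact SCMFS objective from the thesis: the feature matrix X and label matrix Y are jointly factorized with a shared non-negative latent factor U (the "common mode"), using standard multiplicative update rules, and features are then scored by the column norms of the feature basis V. The function name, the specific objective ||X - UV||^2 + alpha*||Y - UB||^2, and the scoring rule are our assumptions for illustration.

```python
import numpy as np

def coupled_nmf_feature_scores(X, Y, k=10, alpha=1.0, n_iter=200, eps=1e-9, seed=0):
    """Score features via a coupled non-negative factorization:
    X ~ U V and Y ~ U B share the latent factor U (the common mode).
    X (n x d) and Y (n x l) must be non-negative."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    _, l = Y.shape
    U = rng.random((n, k))
    V = rng.random((k, d))
    B = rng.random((k, l))
    for _ in range(n_iter):
        # Multiplicative updates for min ||X - UV||^2 + alpha * ||Y - UB||^2
        U *= (X @ V.T + alpha * (Y @ B.T)) / (U @ (V @ V.T) + alpha * (U @ (B @ B.T)) + eps)
        V *= (U.T @ X) / ((U.T @ U) @ V + eps)
        B *= (U.T @ Y) / ((U.T @ U) @ B + eps)
    # A feature's importance is the l2 norm of its column in V:
    # features that contribute strongly to the shared mode score high.
    return np.linalg.norm(V, axis=0)

# Toy usage: 50 samples, 20 features, 4 labels.
rng = np.random.default_rng(1)
X = rng.random((50, 20))                      # non-negative feature matrix
Y = (rng.random((50, 4)) > 0.5).astype(float) # binary label matrix
scores = coupled_nmf_feature_scores(X, Y, k=5)
top_features = np.argsort(-scores)[:5]        # indices of the 5 highest-scoring features
```

The design choice worth noting is the shared U: because both reconstructions draw on the same latent representation, the learned feature basis V is biased toward feature structure that also explains the labels, which is the intuition behind extracting a common mode from the two matrices.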
Keywords/Search Tags: Machine Learning, Feature Selection, Multi-label Learning, Sparse Learning, Classification, Non-negative Matrix Factorization, Coupled Matrix Factorization