Research On Multi-label Feature Selection Algorithm Based On Sparse Learning

Posted on: 2021-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Ling
Full Text: PDF
GTID: 2428330620472180
Subject: Engineering
Abstract/Summary:
We are now in the era of big data, and large volumes of high-dimensional data are found in all areas of human life. These data generally carry rich semantics. Multi-label learning deals with training examples that are each represented by a single instance (feature vector) but associated with more than one label; such data are called multi-label data. When machine learning and data mining techniques are applied to high-dimensional multi-label data, an important problem arises, known as the curse of dimensionality. Feature selection is considered one of the most powerful tools for addressing this problem. In the past few years, multi-label feature selection has attracted increasing attention from researchers, and several algorithms have shown that it can reduce dimensionality effectively.

However, existing algorithms still have problems that are difficult to solve: (1) To select features, existing algorithms usually adopt one of two strategies: select a subset of features shared by all labels (common features), or select features that are discriminative for each individual label (label-specific features). However, both kinds of features may play an important role in discrimination, and both matter for the discriminative ability of the selected features. (2) It is important to explore and exploit label correlations in feature selection. Although existing algorithms have achieved good results, new methods are needed to further improve performance. In addition, existing algorithms usually use global label correlations, whereas label correlations are often local and shared only by local regions of the dataset. (3) Existing algorithms are usually based on the original label information of multi-label data, which cannot fully describe the rich semantics of an instance. On the one hand, because the relevant labels usually contribute differently to describing an instance, the label importances usually differ. On the other hand, label importance cannot be obtained directly from the training samples.

Based on the above observations, this thesis proposes two novel and efficient multi-label feature selection algorithms to solve these problems. To address problems (1) and (2), we propose a novel multi-label feature selection framework. Specifically, common and label-specific features are considered simultaneously by introducing both l2,1-norm and l1-norm regularizers; local label correlations are learned automatically in a probabilistic manner; and the learned correlation information is exploited efficiently to help feature selection by constraining label correlations on the output of labels. A comparative study with seven state-of-the-art methods demonstrates the efficacy of our framework. To deal with problem (3), a novel approach is proposed that enriches the original label information and then learns common and label-specific features with the enriched label information. Specifically, to enrich the label information, the manifold structure of the feature space is exploited to transform the original categorical labels into numerical ones. The enriched label information is then used to steer feature selection. Extensive experiments validate the superiority of our proposed approach.
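As a rough illustration of the two regularizers named in the abstract, the sketch below computes the l2,1-norm and the l1-norm of a small weight matrix and ranks features by row norm. It is not the thesis's algorithm: the matrix `W` (rows are features, columns are labels), its values, and the function names `l21_norm`, `l1_norm`, and `rank_features` are all assumptions made for the example. The l2,1-norm sums the Euclidean norms of the rows, so penalizing it tends to zero out whole rows (features dropped for all labels), while the l1-norm sums absolute entries, so penalizing it tends to zero out individual feature-label weights.

```python
import numpy as np

def l21_norm(W):
    """l2,1-norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def l1_norm(W):
    """l1-norm: sum of the absolute values of all entries of W."""
    return float(np.sum(np.abs(W)))

def rank_features(W):
    """Feature indices sorted by descending row norm (a common scoring choice)."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores, kind="stable")

# Hypothetical weights: 3 features (rows) x 2 labels (columns).
W = np.array([
    [3.0, 4.0],   # feature 0: weighted for both labels (common feature)
    [0.0, 0.0],   # feature 1: unused by every label
    [0.0, 1.0],   # feature 2: weighted only for label 1 (label-specific)
])

print(l21_norm(W))       # row norms are 5, 0 and 1
print(l1_norm(W))        # 3 + 4 + 1
print(rank_features(W))  # feature 0 first, unused feature 1 last
```

In a full method these two norms would appear as penalty terms in a training objective; how they are weighted and combined is not specified in the abstract.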
Keywords/Search Tags: Multi-label learning, Feature selection, Label enhancement, Local label correlations, Label-specific