Research On Multi-label Feature Selection Based On Weighted Labels And Consistent Neighborhood

Posted on: 2022-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: S Lu
GTID: 2518306773467974
Subject: Automation Technology
Abstract/Summary:
With the continuing development of the big data era, many kinds of data are being collected in various fields. Much of this data is described by multiple labels at the same time and is called multi-label data. Multi-label learning uses an existing multi-label dataset to train a machine learning model that assigns a set of labels to each object to be classified. It plays an important role in many application fields, such as medical diagnosis, text classification, image recognition, and applied finance. However, existing multi-label data often has thousands of features, many of which are redundant or irrelevant. This high dimensionality leads to the "curse of dimensionality", which degrades the classification performance of machine learning models. Feature selection picks the most important features from the original feature set to reduce the number of features, which can effectively improve the performance of the classification model. Feature selection has therefore received extensive attention from scholars. This thesis focuses on multi-label learning and multi-label feature selection. The main research contents are as follows:

(1) Multi-label feature selection based on weighted labels. In multi-label learning, different labels differ in how well they discriminate between samples. Based on this observation, a multi-label feature selection algorithm based on weighted labels is proposed. First, the classification margin of each sample in the feature space is used as the weight of the class label. Second, the average classification margin of the samples under the fused label weights is calculated and used to construct the feature subset evaluation function, and the ability of each feature to distinguish samples is then used as the feature weight, so as to measure the importance of different features to the labels. Third, the features are sorted in descending order of feature weight, which yields a new feature ranking. A series of experimental results show that the proposed algorithm has considerable advantages over other multi-label feature selection algorithms.

(2) Multi-label feature selection based on multi-granularity consistent neighborhood. Traditional multi-label feature selection algorithms rarely define the neighborhood of samples in both the label space and the feature space. To address this, the thesis designs a multi-label feature selection algorithm based on multi-granularity neighborhood consistency. First, all samples are granulated using the neighborhood consistency of the label space and the feature space, and a new multi-label neighborhood information entropy and a new multi-label neighborhood mutual information are defined from the viewpoint of multi-granularity neighborhood consistency. Second, an objective function based on the new multi-label neighborhood mutual information is constructed to evaluate the quality of candidate features and thus the importance of each feature. The effectiveness of the proposed algorithm is verified with several evaluation metrics.
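The three steps of contribution (1) can be illustrated with a short Python sketch. It is not the thesis's algorithm: the abstract gives no formulas, so the sketch assumes a Relief-style margin (distance to the nearest sample that disagrees on a label minus distance to the nearest sample that agrees on it), Manhattan distance, and label weights equal to the normalized average margin, with negative averages clipped to zero. The function name `rank_features_by_weighted_margin` and its return values are hypothetical.

```python
import numpy as np


def rank_features_by_weighted_margin(X, Y):
    """Rank features by a label-weighted, Relief-style margin (illustrative only).

    X: (n_samples, n_features) numeric array.
    Y: (n_samples, n_labels) binary label matrix.
    Returns (ranking, feature_weights, label_weights).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y)
    n_samples, n_features = X.shape
    n_labels = Y.shape[1]

    # Pairwise Manhattan distances; a sample is never its own neighbor.
    dist = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    np.fill_diagonal(dist, np.inf)

    label_weights = np.zeros(n_labels)
    label_scores = np.zeros((n_labels, n_features))
    for l in range(n_labels):
        # agree[i, j] is True when samples i and j share the value of label l.
        agree = Y[:, l][:, None] == Y[:, l][None, :]
        margins = []
        for i in range(n_samples):
            hit_dist = np.where(agree[i], dist[i], np.inf)
            miss_dist = np.where(~agree[i], dist[i], np.inf)
            hit, miss = hit_dist.argmin(), miss_dist.argmin()
            # Skip samples that have no nearest hit or no nearest miss.
            if not (np.isfinite(hit_dist[hit]) and np.isfinite(miss_dist[miss])):
                continue
            # Step 1: the sample's margin under label l.
            margins.append(miss_dist[miss] - hit_dist[hit])
            # Step 2: per-feature contribution to that margin.
            label_scores[l] += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
        if margins:
            # Assumption: label weight = average margin, clipped at zero.
            label_weights[l] = max(float(np.mean(margins)), 0.0)

    total = label_weights.sum()
    if total > 0:
        label_weights = label_weights / total

    # Fuse the per-label scores using the label weights.
    feature_weights = label_weights @ label_scores / n_samples
    # Step 3: sort features in descending order of weight.
    ranking = np.argsort(-feature_weights, kind="stable")
    return ranking, feature_weights, label_weights
```

Calling the function on a small matrix in which only the first column separates the label groups should place that column first in `ranking`. The double loop costs O(n² · labels) and is meant for reading, not for large datasets.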
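The idea behind contribution (2), scoring a feature by how consistently its neighborhoods agree with neighborhoods in the label space, can be sketched in Python as follows. The thesis's new entropy and mutual information definitions are not given in the abstract, so this sketch substitutes a commonly used form of neighborhood mutual information, NMI(F; L) = -(1/n) Σ log2(|N_F(x_i)| · |N_L(x_i)| / (n · |N_F(x_i) ∩ N_L(x_i)|)). The radii `delta_feat` and `delta_label`, the single-granularity treatment, and all function names are assumptions made for illustration.

```python
import numpy as np


def neighborhood_mask(Z, delta):
    """Boolean (n, n) matrix: entry (i, j) is True when row j lies within
    Euclidean distance delta of row i. Every sample is its own neighbor."""
    Z = np.asarray(Z, dtype=float)
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)) <= delta


def neighborhood_entropy(mask):
    """Neighborhood entropy: -(1/n) * sum_i log2(|N(x_i)| / n)."""
    n = mask.shape[0]
    return float(-np.mean(np.log2(mask.sum(axis=1) / n)))


def neighborhood_mutual_information(feat_mask, label_mask):
    """Mutual information between two neighborhood systems on the same samples."""
    n = feat_mask.shape[0]
    # Neighbors that are consistent in both the feature and the label space.
    joint = feat_mask & label_mask
    n_feat = feat_mask.sum(axis=1)
    n_label = label_mask.sum(axis=1)
    n_joint = joint.sum(axis=1)  # at least 1, since each sample neighbors itself
    return float(-np.mean(np.log2(n_feat * n_label / (n * n_joint))))


def rank_features_by_nmi(X, Y, delta_feat=0.15, delta_label=0.0):
    """Score each feature on its own against the label space; rank descending."""
    X = np.asarray(X, dtype=float)
    # delta_label = 0 means label neighbors have identical label vectors.
    label_mask = neighborhood_mask(Y, delta_label)
    scores = np.array([
        neighborhood_mutual_information(
            neighborhood_mask(X[:, [f]], delta_feat), label_mask
        )
        for f in range(X.shape[1])
    ])
    return np.argsort(-scores, kind="stable"), scores
```

A feature whose neighborhoods coincide with the label neighborhoods reaches the label-space entropy as its score, while a feature whose neighborhoods cut across the label groups scores near zero. Features are assumed to be scaled to a comparable range (for example [0, 1]) so that one radius is meaningful for all of them.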
Keywords/Search Tags: multi-label learning, multi-label feature selection, weighted labels, multi-granularity, neighborhood consistency