
Streaming Multi-label Feature And Label Specific-feature Selection Algorithm Based On Mutual Information

Posted on: 2022-07-28   Degree: Master   Type: Thesis
Country: China   Candidate: C Y Chen   Full Text: PDF
GTID: 2518306485950139   Subject: Computer technology
Abstract/Summary:
With the development of computer science and technology, we are confronted with many kinds of data in daily life. Many of these data are associated with several labels at the same time; such data are called multi-label data and are the object of multi-label learning tasks. Multi-label learning builds a classification model from a large amount of existing multi-label data, and after training the model can label unseen data. Multi-label learning plays a role in many real-world applications, such as image recognition, text classification, and audio recognition. However, multi-label data often have a large number of features, and high feature dimensionality easily leads to the "curse of dimensionality", which degrades the classification performance of the model. Feature selection, as an effective technique for improving model performance, has therefore attracted attention from researchers. This thesis focuses on multi-label learning and multi-label feature selection. The main research contents are as follows:

(1) In some applications, features enter the model in chronological order (i.e., streaming features). To select features from streaming features for multi-label learning, we propose a multi-label streaming feature selection algorithm based on neighborhood interaction gain information. Firstly, we define the neighborhood interaction gain based on neighborhood mutual information; it measures the relationship between a feature and the already selected subset. Secondly, online correlation analysis and online redundancy analysis are used to evaluate the streaming features. Thirdly, based on neighborhood mutual information, we build an objective function and present the feature selection procedure. Finally, experimental results on six multi-label datasets under four evaluation criteria demonstrate the effectiveness and stability of the algorithm.

(2) Using the knowledge of label-specific features, we propose a multi-label label-specific feature selection algorithm based on mutual information. Firstly, a basic optimization framework is constructed to learn the feature weight matrix. Secondly, we define a feature importance metric from mutual information and the Pearson correlation coefficient. This metric helps to learn a weight matrix that shows the positive and negative correlations between features and labels. Thirdly, we assume that two strongly correlated class labels share more features with each other than two uncorrelated or weakly correlated ones. The optimization framework can be used not only as a classifier but also as a basis for feature selection. Finally, experimental results show the effectiveness and stability of the algorithm.
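The streaming procedure described in (1) can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes discrete features and swaps in plain (non-neighborhood) mutual information for the neighborhood interaction gain, and the names `stream_select`, `rel_threshold`, and `red_ratio` are hypothetical choices made only for this illustration.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    _, xi = np.unique(np.asarray(x), return_inverse=True)
    _, yi = np.unique(np.asarray(y), return_inverse=True)
    xi, yi = xi.ravel(), yi.ravel()
    # Empirical joint distribution from co-occurrence counts.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def relevance(f, Y):
    """Sum of mutual information between feature f and every label column."""
    return sum(mutual_information(f, Y[:, j]) for j in range(Y.shape[1]))

def stream_select(X, Y, rel_threshold=0.05, red_ratio=0.9):
    """Process features one at a time, as if they arrived as a stream.

    Assumed rule (illustrative only): keep a feature if it is relevant to
    the labels and no already selected feature carries most of its
    information.
    """
    selected = []
    for t in range(X.shape[1]):
        f = X[:, t]
        # Online relevance check: discard features that say little about the labels.
        if relevance(f, Y) < rel_threshold:
            continue
        # Online redundancy check: MI(f, f) equals the entropy of f.
        entropy_f = mutual_information(f, f)
        redundant = any(
            mutual_information(f, X[:, s]) >= red_ratio * entropy_f
            for s in selected
        )
        if not redundant:
            selected.append(t)
    return selected
```

Calling `stream_select(X, Y)` with an integer-coded feature matrix `X` of shape (samples, features) and a binary label matrix `Y` of shape (samples, labels) returns the indices of the retained features in arrival order. Continuous features would first need discretization, or a neighborhood-based estimate such as the one the thesis defines.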
Keywords/Search Tags: multi-label learning, multi-label feature selection, streaming feature selection, label-specific features, mutual information