
Online Multi-label Streaming Feature Selection Based On Label Correlation

Posted on: 2024-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: F H Bao
GTID: 2568307064455664
Subject: Computer technology
Abstract/Summary:
With the continuous development of computer and communication technology, we encounter many types of data in daily life. Among them, multi-label data keeps emerging in fields such as image recognition, information retrieval, and emotion recognition. Multi-label data consists of instances that are each associated with multiple labels, and the labels are often related to one another. Because the presence of one label often suggests the presence of others, how to exploit this latent relationship in multi-label learning is an important question studied in this thesis. Like single-label data, multi-label data often has a large number of features, most of which are redundant or irrelevant. Such features increase the computational burden, reduce the classification performance of the model, and can lead to the "curse of dimensionality." It is therefore important to remove from the high-dimensional feature space those features that do not help label classification. Consequently, feature selection, a technique that can effectively reduce feature dimensionality, has attracted the attention of many scholars. It selects an optimal feature subset from the original features by eliminating irrelevant and redundant ones. This thesis focuses on multi-label learning and multi-label feature selection, and its main work is as follows:

(1) Most existing multi-label streaming feature selection algorithms ignore the correlation between labels when selecting features, which lowers their prediction accuracy. To address this problem, an online multi-label streaming feature selection algorithm that combines neighborhood information and label correlation is proposed. First, an adaptive neighborhood relation is defined to solve the granularity selection problem of neighborhood rough sets, and it is extended to multi-label learning. Then, mutual information is used to calculate the correlation between labels, from which label weights are obtained. Finally, the correlation between features and labels is evaluated using neighborhood rough sets and the label weights, and three criteria are designed to evaluate dynamically arriving candidate features: online significance analysis, online relevance analysis, and online redundancy analysis. Experimental results on 7 multi-label datasets under 5 evaluation metrics show that the proposed algorithm has better overall performance.

(2) Considering an application scenario in which features stream into the feature space in groups rather than one by one, an online multi-label streaming feature selection algorithm based on group features is proposed. First, the correlation between each pair of labels is calculated to generate a weighted undirected graph, and the weight of each label is then obtained from the label correlations. Feature selection is then divided into two parts: intra-group feature selection and inter-group feature selection. In the intra-group stage, the correlation between each feature and each single label is evaluated using a chosen metric that takes label correlation into account, and the most representative feature is selected and retained. In the inter-group stage, a multi-label group feature selection model based on label correlation is established by considering the group structure information between feature groups together with the correlation and classification performance of the labels. Finally, experiments show that the proposed algorithm achieves better classification performance than the comparison algorithms.
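The label-weighting step mentioned in both parts of the abstract (pairwise label correlation via mutual information, a weighted undirected label graph, and a weight per label) can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so the normalization below (each label's weight is its share of the total edge weight) is an assumption made only for illustration, and the function names `mutual_information`, `label_graph`, and `label_weights` are hypothetical.

```python
import numpy as np


def mutual_information(a, b):
    """Mutual information (in nats) between two binary label vectors."""
    a = np.asarray(a, dtype=int)
    b = np.asarray(b, dtype=int)
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = np.mean((a == va) & (b == vb))  # joint probability
            p_a = np.mean(a == va)                 # marginal of a
            p_b = np.mean(b == vb)                 # marginal of b
            if p_ab > 0:  # p_ab > 0 implies both marginals are > 0
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return float(mi)


def label_graph(Y):
    """Adjacency matrix of a weighted undirected label graph.

    Y is an (instances x labels) binary matrix; the edge weight between
    two labels is their mutual information.
    """
    Y = np.asarray(Y, dtype=int)
    q = Y.shape[1]
    W = np.zeros((q, q))
    for i in range(q):
        for j in range(i + 1, q):
            W[i, j] = W[j, i] = mutual_information(Y[:, i], Y[:, j])
    return W


def label_weights(W):
    """Assumed weighting: normalized sum of each label's edge weights."""
    strength = W.sum(axis=1)
    total = strength.sum()
    if total == 0:  # no correlated labels: fall back to uniform weights
        return np.full(len(strength), 1.0 / len(strength))
    return strength / total


# Toy label matrix: labels 0 and 1 always co-occur, label 2 is independent.
Y = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
])
W = label_graph(Y)
weights = label_weights(W)
print(W)
print(weights)
```

In this toy example the two perfectly co-occurring labels share all of the weight, while the independent label receives none. A real method would likely smooth or offset the weights so that uncorrelated labels are not ignored entirely.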
Keywords/Search Tags: Multi-label data, Multi-label feature selection, Label correlation, Group feature, Streaming feature