
Multi-label Feature Selection Method In The Context Of Missing Labels

Posted on: 2022-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Zhang
GTID: 2518306773467914
Subject: Tourism
Abstract/Summary:
In real-world applications, data is everywhere, and a single instance is often tagged with multiple labels simultaneously. Such data is called multi-label data. With the development of information science, the feature dimensionality of multi-label data has grown explosively, which poses severe challenges for traditional multi-label learning. Feature selection can effectively alleviate the problems caused by high-dimensional data, and it has therefore become a common data preprocessing method. Previous multi-label feature selection algorithms assume, when constructing the model, that the label space is complete and acquired in advance. In practical supervised learning tasks, however, labels in the label space may be missing, and labels may also arrive in the label space dynamically as a stream. Building multi-label feature selection models for the missing-label and streaming missing-label settings therefore has broad application value and practical significance. This thesis studies multi-label feature selection in these two settings. The principal work is as follows:

(1) Completing missing labels is an effective way to deal with the missing-label setting. To complete the labels, label-specific features are selected from the original feature space for each class label, and the missing labels are completed by modeling the correlation between the label-specific features and the labels. Based on this idea, we propose multi-label feature selection based on label-specific features with missing labels (MFSLML). First, the label-specific features of each class label are obtained through a sparse learning strategy. Second, a linear regression function models the correlation between each class label and its label-specific features, which allows missing labels to be recovered. Then, features are selected based on the label-specific features. Finally, comparative experiments verify the effectiveness of the proposed algorithm.

(2) Labels arriving dynamically over time occur in many fields, yet existing multi-label feature selection algorithms with missing labels ignore this scenario. Building on the work in (1), we further consider the dynamic streaming-label scenario and propose multi-label feature selection with streaming and missing labels (MFSSML). First, we learn the correlations among the labels that have arrived. Second, the missing labels are completed using these label correlations. Then, a feature subset is selected by computing feature scores from the label-specific features. Finally, experiments show that the proposed algorithm achieves better classification performance than the comparison algorithms.
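The three steps described in (1) can be illustrated with a minimal sketch. This is not the thesis's MFSLML formulation, which the abstract does not specify. It assumes labels coded as +1/-1, a boolean mask marking observed label entries, an L1-regularized least-squares objective solved by ISTA as the "sparse learning strategy", and the row-wise absolute weight sum as the feature score. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def soft_threshold(W, t):
    # Proximal operator of the L1 norm: shrink every entry toward zero by t.
    return np.sign(W) * np.maximum(np.abs(W) - t, 0.0)

def fit_sparse_weights(X, Y, observed, lam=5.0, n_iter=300):
    # ISTA for: min_W 0.5 * ||observed * (X W - Y)||_F^2 + lam * ||W||_1.
    # Column j of W is sparse; its nonzero rows play the role of the
    # label-specific features of label j (an assumption of this sketch).
    W = np.zeros((X.shape[1], Y.shape[1]))
    # 1 / ||X||_2^2 is a safe step size because the mask only removes terms.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)
    for _ in range(n_iter):
        grad = X.T @ (observed * (X @ W - Y))
        W = soft_threshold(W - step * grad, step * lam)
    return W

def complete_labels(X, Y, observed, W):
    # Keep observed labels; fill missing ones with the sign of the
    # linear regression output X W.
    predicted = np.where(X @ W >= 0, 1.0, -1.0)
    return np.where(observed, Y, predicted)

def select_features(W, k):
    # Score each feature by its total absolute weight across all labels.
    scores = np.abs(W).sum(axis=1)
    return np.argsort(-scores)[:k]

# Synthetic demo: label j depends only on feature j (j = 0, 1, 2).
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 20))
Y_true = np.where(X[:, :3] > 0, 1.0, -1.0)
observed = rng.random(Y_true.shape) > 0.3      # roughly 30% of labels missing
Y_obs = np.where(observed, Y_true, 0.0)        # missing entries stored as 0

W = fit_sparse_weights(X, Y_obs, observed)
Y_hat = complete_labels(X, Y_obs, observed, W)
selected = select_features(W, k=3)
recovery_acc = (Y_hat[~observed] == Y_true[~observed]).mean()
```

In this sketch the mask keeps missing entries out of the regression loss, so they influence neither the learned weights nor the feature scores. The streaming setting in (2) would additionally need label correlations learned over the labels that have already arrived, which is not shown here.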
Keywords/Search Tags:multi-label data, multi-label feature selection, label-specific feature, missing label, streaming label