
Research On Algorithm Of Feature Selection With Fuzzy Discernibility Matrix For Multi-label Classification

Posted on: 2024-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Mi
GTID: 2568306941953809
Subject: Applied Statistics
Abstract/Summary:
In traditional machine learning frameworks, a sample is usually associated with a single category label or output value. As learning scenarios grow more complex, data objects associated with multiple binary semantic output values are becoming more common. Traditional supervised learning algorithms struggle to predict all the labels associated with a sample at the same time, whereas the multi-label learning algorithms that have emerged in recent years can efficiently process complex data carrying multiple semantic outputs. However, both traditional supervised learning and multi-label learning ignore many real-world issues, such as the fact that different label variables in the output space carry different weights. In addition, the input space of multi-label data usually has a large number of feature dimensions, so feature selection is also an important learning task in multi-label classification.

This thesis therefore pursues two goals. On the one hand, it enhances the information in multi-label data by considering the importance of each label to each sample, so as to mine richer and more general semantic information from multi-label datasets and improve the classification performance of multi-label learning. On the other hand, to avoid the curse of dimensionality, it performs feature selection on the label-enhanced multi-label data, aiming to improve the learning performance and efficiency of multi-label classification algorithms. For the high-dimensional, complex data collected in the real world, fuzzy rough set theory can efficiently quantify the uncertainty between the input features and the output without any prior knowledge, and extract the key features that are indispensable for labeling. To mine more latent semantic information, this thesis transforms the logical labels of samples into label distribution data and proposes a multi-label feature selection algorithm based on fuzzy rough set theory, so as to improve the effectiveness of multi-label learning.

First, building on the fuzzy C-means clustering algorithm (FCM), this thesis constructs a new membership function based on the fuzzy covariance matrix and possible values, and uses this membership function to transform multi-label data into label distribution data. Then, for the label distribution data, it calculates the importance of features based on fuzzy rough set theory and uses this feature importance to construct a multi-label feature selection algorithm for label distributions. The proposed algorithm is compared with common multi-label feature selection algorithms on eight public datasets. An analysis of five evaluation metrics commonly used in multi-label learning shows that the proposed algorithm is effective and feasible.
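
The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: `fcm_memberships` is the textbook FCM membership formula (not the new covariance-based membership function, whose definition is not given here), and `dependency` is one common fuzzy rough dependency measure, assuming features and label distributions scaled to [0, 1], a min-based fuzzy similarity relation, and the Kleene-Dienes implicator. All function names and the greedy forward search are assumptions made for illustration.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Textbook FCM membership of each sample to each cluster centre.

    X: (n, d) samples, centers: (c, d) cluster centres, m > 1: fuzzifier.
    Returns an (n, c) matrix whose rows sum to 1.
    """
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    dist = np.maximum(dist, 1e-12)           # avoid division by zero
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_similarity(V):
    """Pairwise similarity: min over columns of 1 - |difference|.

    Assumes every column of V is already scaled to [0, 1].
    """
    diff = np.abs(V[:, None, :] - V[None, :, :])
    return (1.0 - diff).min(axis=2)

def dependency(X, D, subset):
    """Fuzzy rough dependency of label distribution D on X[:, subset].

    For each sample x, the fuzzy lower approximation is
    inf_y max(1 - R_B(x, y), R_D(x, y)); the dependency is its mean.
    """
    r_b = fuzzy_similarity(X[:, subset])
    r_d = fuzzy_similarity(D)
    lower = np.maximum(1.0 - r_b, r_d).min(axis=1)
    return float(lower.mean())

def greedy_select(X, D, k):
    """Greedy forward selection: add the feature that raises dependency most."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda j: dependency(X, D, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (made up for illustration): feature 0 separates the two
# label distributions, feature 1 is constant and carries no information.
X_toy = np.array([[0.0, 0.5], [0.0, 0.5], [1.0, 0.5], [1.0, 0.5]])
D_toy = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
chosen = greedy_select(X_toy, D_toy, k=1)
```

On the toy data, the dependency on the informative feature is higher than on the constant one, so the greedy search picks feature 0 first. A real experiment would replace the toy arrays with normalised features and a label distribution produced by whatever label-enhancement step is being studied.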
Keywords/Search Tags: Multi-label learning, label distribution, feature selection, rough set