
Feature Dimension Reduction Method For Multi-label Learning

Posted on: 2018-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: B F Zhou
Full Text: PDF
GTID: 2348330518950739
Subject: Applied Mathematics

Abstract/Summary:
In multi-label learning, each sample is associated with multiple labels rather than a single one, and the labels are not independent of each other. Multi-label data are usually high-dimensional, which increases the complexity and difficulty of data mining. In recent years, how to process multi-label data efficiently has become an active research topic. Feature reduction lowers the dimensionality and shrinks the scale of multi-label data, and can thereby improve the performance of multi-label learning. This thesis therefore proposes two algorithms for reducing the dimensionality of multi-label data.

(1) The first algorithm is multi-label learning feature reduction based on principal component analysis (MLFR-PCA). First, the original data are projected onto a low-dimensional subspace using the PCA principle, which makes the samples denser and removes noise. Second, the algorithm treats all labels as a whole and introduces a sparse regression between the labels and the features; this establishes the relationship between the label space and the feature space and yields the objective function for dimensionality reduction. Third, the l2,1-norm is incorporated to regularize and optimize the objective. In this way the dimensionality of the multi-label data is reduced.

(2) The second algorithm is multi-label learning feature reduction based on nonnegative matrix factorization (MLFR-NMF). First, the algorithm represents the similarity matrix of the feature space as the product of the original data matrix and a nonnegative matrix. Second, treating all labels as a whole, it constructs the similarity matrix of the label space using existing methods. Third, a least-squares formulation is introduced to couple the feature-space similarity matrix with that of the label space, yielding the objective function for dimensionality reduction. Finally, the l2-norm is incorporated to optimize the algorithm and reduce the dimensionality of the multi-label data.

Both algorithms reduce the dimensionality directly and do not need to transform the multi-label data into single-label data. They therefore avoid the extra workload incurred by such a conversion as well as the downstream problems caused by inaccurate conversion. In addition, both algorithms treat all labels as a whole when constructing the objective function, so they can effectively reduce the feature dimension without destroying the label structure. Extensive experiments on real data sets demonstrate that both algorithms are effective.
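The PCA-projection-plus-sparse-regression pipeline described for MLFR-PCA can be sketched as follows. This is a minimal illustration, not the thesis's exact objective: scikit-learn's MultiTaskLasso stands in for the l2,1-norm regularized regression (its mixed-norm penalty zeroes out whole coefficient rows jointly across labels), and the data, dimensions, and alpha value are invented for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import MultiTaskLasso

# Hypothetical multi-label data: 100 samples, 20 features, 5 binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
Y = (rng.normal(size=(100, 5)) > 0).astype(float)

# Step 1: project onto a low-dimensional subspace with PCA
# (densifies the samples and removes noise)
Z = PCA(n_components=10).fit_transform(X)

# Step 2: sparse regression from features to the whole label matrix.
# MultiTaskLasso's penalty (sum of row-wise l2 norms of the coefficient
# matrix) plays the role of the l2,1-norm regularizer here.
reg = MultiTaskLasso(alpha=0.1).fit(Z, Y)
W = reg.coef_.T  # coefficient matrix, shape (10, 5)

# Step 3: rank projected features by the row norms of W, keep the top 5
scores = np.linalg.norm(W, axis=1)
selected = np.argsort(scores)[::-1][:5]
```

Because the row-norm penalty is shared across all labels, a feature is kept or discarded for the whole label set at once, which mirrors the "all labels as a whole" idea in the abstract.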
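The least-squares coupling between the two similarity matrices in MLFR-NMF can be sketched in the same spirit. Again a sketch under stated assumptions: the label-space similarity matrix is taken as Y·Yᵀ (one simple existing choice; the thesis does not specify its construction), the solver is a plain projected-gradient least squares with a Frobenius (l2-style) penalty, and all data and parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, q = 50, 12, 4
X = rng.random((n, d))                      # data matrix
Y = (rng.random((n, q)) > 0.5).astype(float)  # label matrix

# Label-space similarity matrix (assumed construction: Y @ Y.T)
S = Y @ Y.T

# Couple the spaces by least squares with a nonnegative factor H:
#   min_{H >= 0}  ||X H - S||_F^2 + lam * ||H||_F^2
# solved by projected gradient descent with a safe 1/L step size.
lam = 0.1
H = rng.random((d, n))
step = 1.0 / (np.linalg.norm(X, 2) ** 2 + lam)  # Lipschitz bound
for _ in range(500):
    grad = X.T @ (X @ H - S) + lam * H
    H = np.maximum(H - step * grad, 0.0)  # project onto H >= 0

# Row norms of H score how much each original feature contributes
# to reproducing the label-space similarity; keep the top 6.
scores = np.linalg.norm(H, axis=1)
top = np.argsort(scores)[::-1][:6]
```

The nonnegativity constraint on H is what makes this an NMF-style factorization of the similarity structure rather than an ordinary regression.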
Keywords/Search Tags:multi-label learning, feature reduction, principal component analysis, nonnegative matrix factorization, similarity matrix