
Multi-label Learning Based On Dimensionality Reduction

Posted on: 2019-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: S S Li
Full Text: PDF
GTID: 2428330551960986
Subject: Statistics
Abstract/Summary:
Multi-label learning is an active topic in machine learning. In practice, multi-label learning often collects a large number of features to improve its classification performance. However, too many features can lead to the curse of dimensionality and degrade classification. Reducing high-dimensional data effectively is therefore important for improving classification accuracy. In addition, most existing dimensionality reduction algorithms rely on the dependencies between features to assess feature quality, and rarely use the degree of similarity between feature sets as a criterion. Motivated by this, two multi-label dimensionality reduction algorithms are proposed in this thesis. The main contents are as follows:

1. The discriminative embedded clustering (DEC) algorithm is an integrated framework of dimensionality reduction and clustering. Given its effectiveness for dimensionality reduction, we apply DEC to multi-label dimensionality reduction and propose a multi-label learning method based on it. The method combines subspace learning and clustering, which effectively avoids the matrix singularity problem that other dimensionality reduction algorithms cannot solve. The method is compared experimentally with five widely used dimensionality reduction algorithms. The results show that DEC-based dimensionality reduction of multi-label data is feasible and can effectively improve multi-label classification performance.

2. Although DEC is effective for dimensionality reduction of multi-label data, it fails to take full account of the correlation between the feature and label sets, and it neglects the correlation between the feature sets. In view of these shortcomings of DEC and of other existing algorithms, an improved multi-label feature selection algorithm based on mutual information is proposed in this thesis. First, we use the intersection similarity to calculate the similarity between feature sets, so as to eliminate redundant features. Then we use the mutual information between the feature and label sets to extract the relevant feature sequences. Finally, we combine these two ideas and use a balance parameter to control the weight between the two terms, so as to choose the feature sequences that are most relevant to the class labels and have minimal redundancy with other features. Experimental results on eight public data sets demonstrate the effectiveness of the algorithm.
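
The relevance-versus-redundancy idea behind the second algorithm can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not define the intersection similarity, the search strategy, or the name of the balance parameter. The sketch assumes a sum-of-minima over sum-of-maxima similarity on min-max scaled features, equal-width discretization before computing mutual information, a greedy forward search, and a hypothetical parameter name `beta`.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    _, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1.0)      # contingency counts
    joint /= joint.sum()                       # joint probabilities
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def discretize(col, bins=4):
    """Equal-width binning of a continuous feature (an assumption here)."""
    edges = np.linspace(col.min(), col.max(), bins + 1)[1:-1]
    return np.digitize(col, edges)


def intersection_similarity(a, b):
    """Assumed definition: sum(min) / sum(max) on min-max scaled vectors."""
    def scale(v):
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v, dtype=float)

    a, b = scale(a), scale(b)
    denom = np.maximum(a, b).sum()
    return float(np.minimum(a, b).sum() / denom) if denom > 0 else 1.0


def select_features(X, Y, k, beta=0.5, bins=4):
    """Greedy selection: relevance minus beta times mean redundancy.

    X: (n_samples, n_features) float array; Y: (n_samples, n_labels) 0/1 array.
    `beta` is a hypothetical name for the balance parameter.
    """
    n_features = X.shape[1]
    Xd = np.column_stack([discretize(X[:, j], bins) for j in range(n_features)])
    # Relevance: mutual information with each label, summed over labels.
    relevance = np.array([
        sum(mutual_information(Xd[:, j], Y[:, l]) for l in range(Y.shape[1]))
        for j in range(n_features)
    ])
    selected, remaining = [], list(range(n_features))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = np.mean(
                [intersection_similarity(X[:, j], X[:, s]) for s in selected]
            )
            return relevance[j] - beta * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    f0 = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
    f2 = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
    X = np.column_stack([f0, f0.copy(), f2, np.full(8, 0.5)])
    Y = np.column_stack([f0, f2]).astype(int)
    print(select_features(X, Y, k=2, beta=0.5))
```

In this toy data, feature 1 duplicates feature 0 and feature 3 is constant. With `beta=0.5` the redundancy penalty makes the sketch skip the duplicate and pick features 0 and 2; setting `beta=0` removes the penalty, so the duplicate is no longer disfavored.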
Keywords/Search Tags: feature dimensionality reduction, the Discriminative Embedded Clustering algorithm, intersection similarity, mutual information, the performance of classification