Multi-label Classification Research Based On Label-specific Features And Label Correlation

Posted on: 2020-11-19 | Degree: Master | Type: Thesis
Country: China | Candidate: F Li | Full Text: PDF
GTID: 2428330602452471 | Subject: Applied Mathematics
Abstract/Summary:
With the continuous development of information technology and the arrival of the era of big data, a large number of multi-label data sets have been generated in different fields, and their scale keeps growing. How to mine or learn from such large-scale data effectively to obtain valuable information is an urgent problem. Multi-label learning has three main characteristics: (1) each sample in the training set corresponds to a label set consisting of multiple labels, and the labels are correlated with one another; (2) for each label, label-specific features carry additional information about that label, that is, they enrich the label's information; (3) multi-label classification suffers from class imbalance, that is, the instances in multi-label data sets are not evenly distributed across classes. Based on these characteristics, this paper studies multi-label data sets. The main work is as follows.

Based on label-specific features and label correlation, this paper proposes a multi-label classification algorithm, LP-LFLC. The basic idea of the algorithm is as follows. First, for each label, a feature mapping function is constructed using clustering techniques and a distance formula. The original feature space is then transformed into a label-specific feature space to obtain the label feature set of each label, which enriches the label information. Second, an instance-based nearest-neighbor method exploits the correlation between labels to expand the label feature set of each label. Finally, LP-LFLC is compared in MATLAB simulations with six classical algorithms on eight common data sets, and the experimental results confirm that LP-LFLC has better classification performance.

To better address class imbalance and to use label-specific features and label correlation more effectively, this paper builds on the basic ideas of LP-LFLC and proposes an improved multi-label classification algorithm, LSFLC, which integrates label-specific features and label correlation more effectively when building the classification model. The main process of LSFLC is as follows. First, for each label, new positive-class instances are generated iteratively by a re-sampling technique to extend the label's set of positive instances. Second, the original feature space is transformed into a label-specific feature space using the feature mapping function constructed in LP-LFLC, which yields the label feature set of each label. Then, for each label, the most positively related label is found by constructing a co-occurrence matrix, and the positive-class instances of that related label are copied to expand the label feature set. Finally, LSFLC is compared experimentally with LP-LFLC and several other classical multi-label classification algorithms on eight different data sets. The experimental results confirm that LSFLC has better classification performance.

In summary, this paper proposes two multi-label classification algorithms based on label-specific features and label correlation. Their effectiveness is demonstrated by theoretical analysis and experimental simulation.
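Two of the steps described above, the label-specific feature mapping and the co-occurrence lookup, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract does not name the clustering method or the distance formula, so k-means and Euclidean distance are assumed here, and the function names, the parameter `k`, and the toy data are hypothetical. The re-sampling and instance-copying steps are not shown.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns k centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def label_specific_features(X, y, k=2):
    """Map X to Euclidean distances from cluster centers of one label's
    positive and negative instances (assumes both groups are non-empty)."""
    pos, neg = X[y == 1], X[y == 0]
    k = min(k, len(pos), len(neg))
    centers = np.vstack([kmeans(pos, k), kmeans(neg, k)])
    return np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)

def most_related_labels(Y):
    """For each label, return the index of the label it co-occurs with most."""
    Y = np.asarray(Y, dtype=int)
    C = Y.T @ Y               # C[i, j] = instances carrying both label i and j
    np.fill_diagonal(C, -1)   # a label must not be matched with itself
    return C.argmax(axis=1)

# Hypothetical toy data: 6 instances, 2 features, 3 labels.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0],
              [5.1, 5.0], [0.0, 5.0], [0.1, 5.1]])
Y = np.array([[1, 1, 0], [1, 1, 0], [0, 1, 1],
              [0, 0, 1], [1, 0, 0], [1, 1, 0]])

Z0 = label_specific_features(X, Y[:, 0])  # feature set for label 0
related = most_related_labels(Y)
print(Z0.shape, related)
```

Each label gets its own mapped matrix such as `Z0`, on which a separate binary classifier could be trained. Raw co-occurrence counts are only one possible reading of "most positively related"; a normalized measure would be an equally plausible choice.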
Keywords/Search Tags: Multi-label classification, Local label correlation, Label-specific features, Class imbalance, Supplementation of related examples