
Research On Multi-label Classification Algorithm With Label Correlations

Posted on: 2016-12-09    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Z F He    Full Text: PDF
GTID: 1108330488997646    Subject: Statistics
Abstract/Summary:
In the traditional supervised learning framework, each object is assigned only one label. In the real world, however, one object may be associated with multiple labels: an image may be labeled "ocean" and "water", and an article may be tagged with "H7N9", "bird flu", "fever", "cough", and so on. Multi-label learning is the framework for learning such tasks and has attracted a great deal of attention in machine learning; how to effectively discover and exploit correlations among labels is its core research issue. A series of multi-label learning algorithms that exploit label correlations have been proposed and applied successfully in many areas. Nevertheless, most of them consider only second-order label correlations, some assume that label correlations are symmetric, and relatively little work addresses the discovery and exploitation of label correlations, especially high-order asymmetric correlations among labels. This dissertation therefore focuses on two aspects: exploiting high-order asymmetric label correlations, and automatically discovering and exploiting label correlations through learning. The main contributions are as follows:

1. We propose a two-step method for learning label correlations and multi-label classification (TMLC). TMLC first computes the high-order asymmetric label correlation matrix by l1 sparse coding in the label space (sketched below), and then formulates a joint learning framework for multi-label classification and feature selection with label correlations. Experimental results on multi-label datasets verify the effectiveness of TMLC.

2. We propose a method for joint learning of label scatter and multi-label classification (JLSML). JLSML introduces a virtual label that serves as a bipartition point between the relevant and irrelevant label sets of an instance (sketched below). We construct a joint model in which label scatter, the training of the classification model, and the partitioning of the label set are learned simultaneously. Experimental results on multi-label datasets show the superiority of JLSML.

3. We propose two approaches, JMLLC and SLMLC, that jointly learn label correlations and the multi-label classifier. The goal is to automatically discover and exploit high-order asymmetric correlations among labels through learning, so that label correlations and multi-label classification are learned simultaneously in a unified framework. In JMLLC, the label correlation matrix and the weight matrix are learned simultaneously, with two different loss functions (logistic regression and least squares). In SLMLC, the weight matrix is assumed to be the sum of a sparse matrix, which captures the label-specific feature subsets relevant to each label, and a low-rank matrix, which captures the feature subspace shared among all labels; the high-order asymmetric label correlation matrix, the sparse matrix, and the low-rank matrix are learned simultaneously in a joint model (sketched below). Experimental results show the effectiveness of JMLLC and SLMLC.

4. We introduce a novel algorithm for multi-label classification with missing labels and feature selection (MLMF). In many real applications it is difficult to obtain all the true labels of each training instance, whereas obtaining partial labels is relatively easy (i.e. some labels are missing), and most existing multi-label classification algorithms cannot handle missing labels and label correlations at the same time. MLMF therefore formulates a unified learning framework that accounts for missing labels and label correlations and adds an l2,1-norm regularization term on the weight matrix for feature learning (sketched below). Experimental results with both full and missing labels demonstrate that MLMF achieves superior performance.
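The sketches below illustrate the formulations referred to in the contributions above; they are schematic readings of the abstract, not the dissertation's exact models. For contribution 1, TMLC first estimates a high-order asymmetric label correlation matrix by l1 sparse coding in the label space. A minimal sketch of that step is given here, assuming a binary label matrix Y; the function name, the regularization strength alpha, and the use of scikit-learn's Lasso are illustrative choices.

```python
# Hypothetical sketch: an asymmetric label correlation matrix obtained by
# l1 sparse coding in the label space. Each label column is reconstructed
# from the remaining label columns with a lasso; the coefficients form the
# correlations. Names and parameters are illustrative, not the dissertation's.
import numpy as np
from sklearn.linear_model import Lasso

def label_correlations(Y, alpha=0.01):
    """Y: (n_samples, n_labels) binary label matrix.
    Returns C where column j holds the sparse coefficients that
    reconstruct label j from all other labels (C[j, j] stays 0)."""
    n_labels = Y.shape[1]
    C = np.zeros((n_labels, n_labels))
    for j in range(n_labels):
        others = np.delete(np.arange(n_labels), j)
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(Y[:, others], Y[:, j])
        C[others, j] = lasso.coef_
    return C  # generally asymmetric: C[i, j] != C[j, i]
```

Because each label is coded on the remaining labels independently, the resulting matrix is in general asymmetric, which matches the high-order asymmetric correlations emphasized in the abstract.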
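For contribution 2, the virtual label in JLSML acts as a bipartition point between relevant and irrelevant labels. One way to write such a decision rule is sketched below; the label scores f_j and the virtual-label score f_0 are assumed notation, not the dissertation's formulation.

```latex
% Relevant label set of an instance x, with the virtual label's score f_0(x)
% serving as the bipartition threshold between relevant and irrelevant labels.
Y_x^{+} \;=\; \bigl\{\, j \in \{1,\dots,q\} \;:\; f_j(x) > f_0(x) \,\bigr\}
```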
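For contribution 3, SLMLC decomposes the weight matrix into a sparse part and a low-rank part while jointly learning an asymmetric label correlation matrix. A generic objective consistent with that description follows; the squared loss, the penalty choices, and the trade-off parameters λ1–λ3 are illustrative assumptions.

```latex
% X: feature matrix, Y: label matrix, C: asymmetric label correlation matrix,
% P: sparse part (label-specific features), Q: low-rank part (shared subspace);
% \|Q\|_* is the nuclear norm, a convex surrogate for the rank of Q.
\min_{P,\;Q,\;C}\;
  \bigl\| X\,(P + Q)\,C - Y \bigr\|_F^2
  \;+\; \lambda_1 \|P\|_1
  \;+\; \lambda_2 \|Q\|_{*}
  \;+\; \lambda_3 \|C\|_1
```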
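For contribution 4, MLMF combines missing labels, label correlations, and an l2,1-norm regularizer on the weight matrix for feature selection. A generic form of such an objective is sketched below; the observed-entry mask Ω and the trade-off parameters are illustrative assumptions.

```latex
% Omega: indicator of observed label entries (0 where a label is missing),
% \odot: element-wise product; the l_{2,1} norm drives entire rows of W to
% zero, so the corresponding features are discarded (feature selection).
\min_{W,\;C}\;
  \bigl\| \Omega \odot \bigl( X W C - Y \bigr) \bigr\|_F^2
  \;+\; \lambda_1 \|W\|_{2,1}
  \;+\; \lambda_2 \|C\|_1,
\qquad
\|W\|_{2,1} \;=\; \sum_{i} \Bigl( \sum_{j} W_{ij}^2 \Bigr)^{1/2}
```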
Keywords/Search Tags: supervised learning, multi-label learning, multi-label classification, label correlations, feature selection, sparse representation, low-rank representation, missing labels, l2,1-norm