
Graph-structured Description Of Multi-label Classification Problems And Researches On Several Learning Methods

Posted on: 2016-05-23    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Z J Chen    Full Text: PDF
GTID: 1108330503953314    Subject: Computer application technology
Abstract/Summary:
Multi-label learning assumes that an instance can be annotated with more than one label and that the predefined labels are interdependent. Such problems arise in many real applications, such as text categorization, semantic annotation of images, functional genomics, medical diagnosis, direct marketing and personalized recommendation in electronic commerce, and have attracted much attention. Research on multi-label learning mainly focuses on learning methods, the exploitation of label correlations, problems with a large number of labels, problems with weak labels, and combinations with other learning problems. This thesis centers on graph-structured descriptions of multi-label classification problems and approaches, and on several multi-label classification algorithms for discovering label correlations and handling large numbers of labels.

(1) Since current research seldom discusses how multi-label classification problems and approaches themselves can be described, the thesis builds graph structures on the output label space and then presents graph-structured descriptions of some typical multi-label classification approaches. From two different views, the thesis constructs a Hasse diagram and a hypercube on the label space, which better describe the characteristics of the structured output space of multi-label classification problems, and then verifies that the Hasse diagram and the hypercube are isomorphic and possess several useful properties.
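The hypercube construction just described can be made concrete: each of the 2^q possible label sets over q labels is a binary vector, i.e. a vertex of the q-dimensional hypercube, and two label sets are adjacent exactly when they differ in a single label (Hamming distance 1). A minimal sketch of this encoding (the function names below are mine, not the thesis's):

```python
from itertools import product

def hypercube_vertices(q):
    """All 2^q label subsets over q labels, encoded as binary tuples
    (the vertices of the q-dimensional hypercube)."""
    return list(product([0, 1], repeat=q))

def hamming(u, v):
    """Hamming distance between two label-set encodings."""
    return sum(a != b for a, b in zip(u, v))

def neighbors(v):
    """Label sets adjacent to v on the hypercube: each flips exactly one label."""
    return [tuple(b ^ (i == j) for j, b in enumerate(v)) for i in range(len(v))]

# Example: with q = 3 labels, the label set {l1, l3} is the vertex (1, 0, 1);
# its hypercube neighbors are (0, 0, 1), (1, 1, 1) and (1, 0, 0).
```

Under this view, a multi-label prediction is simply the selection of one vertex of the hypercube, which is what lets different classification approaches be described on a common graph structure.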
Furthermore, based on the hypercube, the thesis gives detailed graph-structured descriptions of some typical multi-label classification approaches, reflecting the common characteristics of different approaches from a new viewpoint.

(2) Within these graph-structured descriptions, the thesis proposes several learning algorithms that address latent label correlations and large numbers of labels.

① Since modeling data correlations and reducing dimensionality are both important in multi-label learning, the thesis discusses two multi-label classification algorithms based on Canonical Correlation Analysis (CCA). The first, ML-CCA, uses CCA to discover correlations between the sample set and the label set and to extract valuable features for predictive modeling. The second, CCA_LSDR, focuses on problems with a large number of labels: by revising the original optimization problem of CCA, it constrains the mapping on the label space to be orthogonal, obtains a feature-aware label space dimensionality reduction, and finally recovers the predicted label set through decoding. Experiments on six datasets with large numbers of labels demonstrate that both ML-CCA and CCA_LSDR maintain learning performance by exploiting correlations between the sample set and the label set. Compared with ML-CCA, CCA_LSDR predicts much faster thanks to label space dimensionality reduction, and it also trains faster than other label space dimensionality reduction methods.

② Since the learning of label embeddings, which helps discover latent correlations among labels, is seldom coupled with sample features and learning errors, the thesis presents ML-SLDE/DML-SLDE, a unified framework with supervised low-dimensional embedding.
The framework balances the importance of correlation modeling and learning accuracy, and integrates the exploration of underlying structure and the empirical risk into the predictive model. Inspired by latent semantic analysis, the supervised low-dimensional embedding can discover latent semantics and correlations among samples, among labels, and between samples and labels, and extract valuable low-dimensional features for predictive modeling. Meanwhile, the coefficients of the predictive function are determined directly by the obtained low-dimensional mapping. To verify the performance of ML-SLDE/DML-SLDE, extensive experiments on ten benchmark datasets evaluate the competing algorithms with respect to accuracy, training complexity, and sensitivity to critical parameters. The results show that ML-SLDE/DML-SLDE handles different kinds of multi-label datasets well and achieves the best or second-best results on several measures. Its advantage in training cost is significant on most datasets, and sensitivity analysis shows it to be more robust to the size of the low-dimensional subspace.

③ Since real-world data is usually collected incrementally, the thesis further discusses incremental revisions of ML-SLDE/DML-SLDE, aiming to reuse historical information as much as possible and to improve the efficiency of incremental modeling. First, the thesis derives simple incremental variants of the primal and dual forms of the original framework, yielding ML-SLDE_I and DML-SLDE_I. Experiments on seven datasets of different sizes verify that ML-SLDE_I maintains very good performance and trains several times faster than the non-incremental ML-SLDE, and its advantage grows as the number of previously collected samples increases.
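The latent-semantic flavor of the supervised embedding can be conveyed with a much-simplified sketch: factor the concatenated matrix [X, μY] with a truncated SVD so that one latent space jointly captures sample-sample, label-label and sample-label correlations, and read the predictor off the same factorization. This is only an LSA-style approximation of the idea, not the thesis's actual optimization problem, and every name below is an assumption:

```python
import numpy as np

def slde_sketch(X, Y, k=2, mu=1.0):
    """LSA-style sketch of supervised low-dimensional embedding:
    a rank-k SVD of [X, mu*Y] yields one latent space coupling samples
    and labels; the predictor reuses the same factors, so its
    coefficients come directly from the low-dimensional mapping."""
    d = X.shape[1]
    U, s, Vt = np.linalg.svd(np.hstack([X, mu * Y]), full_matrices=False)
    Z = U[:, :k] * s[:k]        # k-dimensional embedding of the samples
    Wx = Vt[:k, :d]             # latent directions on the feature block
    Wy = Vt[:k, d:]             # latent directions on the label block
    def predict(X_new):
        # embed new samples through the feature block, then score labels
        Z_new = X_new @ np.linalg.pinv(Wx)
        return (Z_new @ Wy / mu > 0.5).astype(int)
    return Z, predict
```

The weight μ plays the balancing role the abstract mentions: a larger μ makes the latent space track the label structure more closely, a smaller one keeps it closer to the raw features.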
Furthermore, the thesis proposes an incremental learning formulation based on incremental singular value decomposition (SVD). The formulation builds on the SVD-based algorithm for the primal form ML-SLDE of the unified framework and avoids the problems that can arise when switching between the primal form ML-SLDE_I and the dual form DML-SLDE_I as samples accumulate. An analysis of computational cost shows that the incremental SVD-based formulation is expected to offer the greatest advantages of incremental learning in high-dimensional, low-rank settings.
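While the thesis's incremental formulation is specific to ML-SLDE, the row-append SVD update it relies on can be sketched generically (a Brand-style update; the function name and bookkeeping below are assumptions). Given the truncated SVD of the data seen so far, new sample rows are folded in by decomposing only a small (k+m)×(k+m) core matrix, which is exactly why the high-dimensional, low-rank case is cheap:

```python
import numpy as np

def incremental_svd_rows(U, s, Vt, B, k):
    """Update the truncated SVD (U, s, Vt) of a data matrix after appending
    the new sample rows B, re-decomposing only a small core matrix instead
    of the full (and possibly huge) data matrix."""
    k0, m = s.size, B.shape[0]
    C = B @ Vt.T                   # projection of B onto the current right space
    R = B - C @ Vt                 # residual of B outside that space
    J, K = np.linalg.qr(R.T)       # orthonormal basis J for the residual: R = K.T @ J.T
    core = np.zeros((k0 + m, k0 + m))
    core[:k0, :k0] = np.diag(s)
    core[k0:, :k0] = C
    core[k0:, k0:] = K.T
    Uc, sc, Vct = np.linalg.svd(core)          # small SVD, independent of n and d
    U_big = np.block([[U, np.zeros((U.shape[0], m))],
                      [np.zeros((m, k0)), np.eye(m)]]) @ Uc
    Vt_big = Vct @ np.vstack([Vt, J.T])
    return U_big[:, :k], sc[:k], Vt_big[:k]    # truncate back to rank k
```

Each update costs roughly O((k+m)^3) for the core decomposition plus matrix products linear in the data dimensions, rather than a full O(n d^2) re-decomposition, which matches the abstract's claim about the high-dimensional low-rank regime.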
Keywords: Multi-label classification, Incremental learning, Graph-structured description, Label correlation, Label space dimensionality reduction