The Sparse Methods For Multi-Label Classification

Posted on:2015-05-15

Degree:Master

Type:Thesis

Country:China

Candidate:Z J Ma

GTID:2298330431493442

Subject:Computer software and theory

Abstract/Summary:

Classification is one of the most active research topics in data mining. Traditional classification assumes that each instance belongs to exactly one class label. In real applications, however, an instance can be associated with several labels at once. For example, a news story about the World Cup in Brazil can be labeled "sports meet", "football" and "Brazil". Likewise, depending on how it is used, a computer can be labeled "video and audio", "scientific research" and "shopping online". Such problems are called multi-label problems. Multi-label classification has been widely applied in fields such as text classification, information retrieval and bioinformatics.

Multi-label classification poses more challenges than traditional classification. First, the labels are not independent of each other; there are correlations among them. How to measure and capture the correlations in the label space in order to improve prediction is an open issue. Second, like traditional single-label classification, multi-label classification suffers from high-dimensional data. The high dimensionality exists not only in the instance space but also in the label space. In particular, as the number of labels increases, the label space often becomes sparse. This brings both challenges and opportunities to multi-label learning.

To address these challenges, this thesis proposes three algorithms, each built on an improvement of a different partial least squares regression (PLSR) model. Theoretical analysis and simulation experiments show that all three algorithms achieve effective classification results.

Because singular value decomposition (SVD) can extract the important information in a matrix, we propose a multi-label classification algorithm called SPMD. SPMD performs dimension reduction and regression analysis on multi-label data simultaneously. First, the labels are treated as a whole in order to exploit the label correlations. Next, the score vectors of the instance space and the label space are computed by SVD. Finally, the multi-label classification model is constructed on the basis of PLSR.

Because ridge regression can handle multicollinearity, we present a multi-label classification algorithm named RPLS-DA, where DA stands for discriminant analysis. An l2-norm penalty is imposed on PLS-DA to tackle the "large p, small n" problem caused by high-dimensional data.

We improve the Nonlinear Iterative Partial Least Squares (NIPALS) algorithm with the sparse model LASSO and propose a multi-label classification algorithm called LNMD. LNMD performs dimension reduction and feature selection at the same time, and then takes the label correlations into account when designing the classification model for multi-label data. LNMD also serves as a new sparse method for dimension reduction.
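The abstract does not give the details of SPMD, so the following is only a minimal sketch of the general idea it describes: treat the label matrix as a whole, obtain score vectors from an SVD, and build a PLSR model for prediction. It is a generic PLS2 regression whose weights come from an SVD of the cross-product matrix, not the thesis's actual algorithm. The function names `fit_pls_svd` and `predict_labels`, the use of NumPy, and the 0.5 decision threshold are all assumptions made for illustration.

```python
import numpy as np

def fit_pls_svd(X, Y, n_components=2):
    """Generic PLS2 regression (illustrative, not the thesis's SPMD).

    X: (n, d) instance matrix; Y: (n, L) binary label matrix.
    Returns the coefficient matrix B of shape (d, L) and the column means.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xk, Yk = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        # SVD of the cross-product matrix; all labels are used jointly.
        U, _, _ = np.linalg.svd(Xk.T @ Yk, full_matrices=False)
        w = U[:, 0]                # instance-space weight vector
        t = Xk @ w                 # instance-space score vector
        tt = t @ t
        if tt < 1e-12:             # nothing left to extract
            break
        p = Xk.T @ t / tt          # X loadings
        q = Yk.T @ t / tt          # Y loadings
        Xk = Xk - np.outer(t, p)   # deflate both blocks
        Yk = Yk - np.outer(t, q)
        W.append(w)
        P.append(p)
        Q.append(q)
    if not W:
        raise ValueError("no component could be extracted")
    W, P, Q = np.column_stack(W), np.column_stack(P), np.column_stack(Q)
    # Standard PLS coefficient formula: B = W (P^T W)^{-1} Q^T
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return B, x_mean, y_mean

def predict_labels(X, B, x_mean, y_mean, threshold=0.5):
    """Regress label scores, then threshold them (0.5 is an assumed cutoff)."""
    scores = (np.asarray(X, dtype=float) - x_mean) @ B + y_mean
    return (scores >= threshold).astype(int)
```

A typical call would be `B, xm, ym = fit_pls_svd(X_train, Y_train, n_components=k)` followed by `predict_labels(X_test, B, xm, ym)`. Fitting every label column in a single model is what lets the extracted components reflect correlations among the labels, rather than training one independent classifier per label.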

Keywords/Search Tags:

multi-label classification, SVD, ridge regression, sparse learning

Related items

1	Study Of Classification Problems Based On Sparse Representation And Ensemble Learning
2	Research On Multi-label Classification Algorithm With Label Correlations
3	Research On Acquisition And Application Of Label Correlation In Multi-label Learning
4	Multi-label Prediction Model Based On Ontology Database And Data Mining In Bio-medicine
5	Research On Multi-label Classification Algorithms Based On Samples And Property Analysis
6	Multi-label Feature Selection Based On Manifold Learning And Sparse Regression
7	Research On Multi-label Classification Algorithm Based On Label Relationship
8	The Research And Application Of Classification Method On MTS Based On Ridge Estimation
9	Multi-label Classification Of Complex Scene Based On The Incremental Learning And Deep Sparse Filtering
10	Research On Multi-label Classification With Incomplete Label Information