Research On The Dimensionality Reduction And Classification Algorithms In Multi-label Learning

Posted on:2015-07-01

Degree:Master

Type:Thesis

Country:China

Candidate:K Yan

Full Text:PDF

GTID:2298330422472098

Subject:Communication and Information System

Abstract/Summary:

Multi-label learning originated in text classification, and many real-world machine learning problems fall into this category. Unlike traditional supervised learning, which assumes that each instance is associated with only one class label, multi-label learning allows an instance to belong to multiple labels simultaneously. Improving accuracy typically requires sampling a large number of original features, which leads to the 'curse of dimensionality' and can severely degrade the accuracy of learning algorithms. Obtaining an effective low-dimensional representation of high-dimensional data therefore plays a significant role in improving classification accuracy. The main contributions of this thesis to multi-label classification are summarized as follows:

(1) Multi-label learning, dimensionality reduction methods and manifold learning algorithms are introduced. Manifold learning algorithms can preserve the geometric structure of local patches when high-dimensional data is mapped to a low-dimensional space. However, the locally linear embedding (LLE) algorithm uses a fixed number of neighbors, so it cannot avoid smoothing out or eliminating small-scale structure, or falsely dividing a continuous manifold into unrelated sub-manifolds. Selecting the number of neighbors is therefore important.

(2) Multi-label learning that cannot exploit the latent information in unlabeled data may suffer reduced accuracy. In real scenarios, only a small portion of high-dimensional multi-label data is labeled. Semi-supervised learning methods should therefore be adopted to eliminate redundant features effectively, use the latent information in unlabeled data, and obtain the low-dimensional manifold structure. To make full use of the supervised information of labeled instances and the statistical information of unlabeled instances, and to calculate a suitable number of neighbors, an effective dimensionality reduction algorithm named Variable K-Nearest Semi-Supervised Locally Linear Embedding (VKSSLLE) is proposed.

(3) To improve the accuracy of multi-label learning, an effective multi-label classification algorithm named Variable K-Nearest Semi-Supervised Locally Linear Embedding-Naive Bayes Classifier (VKSSLLE-NBC) is proposed. It uses the VKSSLLE algorithm to obtain the low-dimensional manifold structure embedded in the high-dimensional space, and a naive Bayes classifier to perform multi-label classification. Different dimensionality reduction algorithms are each combined with the multi-label naive Bayes classifier to solve the multi-label learning problem. Experimental results on an artificial dataset and two real-world datasets show that the VKSSLLE-NBC algorithm can effectively improve the accuracy of multi-label learning.
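The general shape of a "reduce dimensionality, then classify with naive Bayes" pipeline can be illustrated with a minimal sketch. This is not the VKSSLLE or VKSSLLE-NBC algorithm from the thesis: it assumes scikit-learn is available, uses standard unsupervised LLE with a fixed neighbor count, substitutes one independent Gaussian naive Bayes model per label for the multi-label naive Bayes classifier, and runs on synthetic data. The neighbor count, target dimension, and dataset sizes are arbitrary illustrative choices.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.metrics import hamming_loss
from sklearn.multioutput import MultiOutputClassifier
from sklearn.naive_bayes import GaussianNB

# Synthetic multi-label data (an assumption; not the datasets used in the thesis).
X, Y = make_multilabel_classification(
    n_samples=300, n_features=40, n_classes=5, n_labels=2, random_state=0
)
X_train, X_test = X[:200], X[200:]
Y_train, Y_test = Y[:200], Y[200:]

# Standard LLE with a FIXED number of neighbors. The thesis argues that a
# fixed k is a limitation and proposes choosing it adaptively; that step is
# not reproduced here.
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=8, random_state=0)
Z_train = lle.fit_transform(X_train)  # low-dimensional embedding of training data
Z_test = lle.transform(X_test)        # out-of-sample mapping for test data

# One Gaussian naive Bayes model per label (a simple stand-in for a
# multi-label naive Bayes classifier).
clf = MultiOutputClassifier(GaussianNB()).fit(Z_train, Y_train)
Y_pred = clf.predict(Z_test)

# Hamming loss: fraction of label assignments that are wrong (lower is better).
loss = hamming_loss(Y_test, Y_pred)
print(f"Hamming loss on held-out data: {loss:.3f}")
```

Changing `n_neighbors` in this sketch is a quick way to see how sensitive the embedding, and hence the downstream classifier, is to the neighbor count.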

Keywords/Search Tags:

Multi-label classification, Multi-label dimensionality reduction, Naive Bayes classifier, Manifold learning, Semi-supervised learning

Related items

1	Semi-Supervised Dimensionality Reduction And Ensemble Learning For Multi-label Classification
2	Research On Key Technologies For Multi-instance Multi-label Web Page Categorization
3	Research On Multi-label Classification With Incomplete Label Information
4	Research And Application Of Multi-label Learning Algorithm
5	Multi-label Image Classification Techniques Based On Semi-supervised Learning
6	Research On Several Key Issues Of Multi-label Learning For Limited Supervised Information
7	AutoLink Semi-supervised Multi-label Study Of Literature Research And Implementation Methods
8	Research On Multi-label Classification Related Technology
9	Research On The Multi-label Feature Selection And Classification Methods With The Label Correlations
10	Research On Weakly-supervised Classification Methods Based On Samples And Labels Modeling