Study On Large-scale Multi-label Learning

Posted on:2019-02-04

Degree:Master

Type:Thesis

Country:China

Candidate:W J Zhang

GTID:2428330566460777

Subject:Software engineering

Abstract/Summary:

Large-scale multi-label learning (LMLL) aims to learn classifiers that can automatically annotate a data point with the most relevant subset of labels from an extremely large label set. It is widely used in applications such as tagging, ranking, and recommendation, and has therefore drawn considerable attention for its practical importance. The main challenge is that both the feature space and the label space are extremely high-dimensional and sparse. With L labels there are 2^L possible label sets, and the label dimension L can be huge, e.g., in the millions for Wikipedia labels. This thesis proposes two novel methods that simultaneously exploit semantic label correlations and establish a nonlinear feature embedding. Experimental results on several benchmark datasets demonstrate the effectiveness and efficiency of the methods. The main contributions of this thesis are as follows:

1. This thesis presents an efficient large-scale multi-label learning method (CoMFM). The method consists of two innovations: i) We present a novel collaborative label embedding algorithm that exploits semantic label correlations by applying collaborative filtering techniques to the label co-occurrence matrix, instead of the training label matrix, and thereby obtains low-dimensional latent representations for all labels. ii) To the best of our knowledge, this is the first work that combines high-order feature correlations and label correlations simultaneously for LMLL. Specifically, to learn high-order nonlinear feature embeddings, we extend the vanilla factorization machine to a multi-output form.

2. This thesis presents a deep-learning-based large-scale multi-label learning method (DXML). The method also consists of two innovations: i) We present a novel deep label graph embedding algorithm to learn low-dimensional representations for all labels. To the best of our knowledge, this is the first work to introduce an explicit label graph structure into LMLL. ii) We present a nonlinear feature embedding based on a deep neural network. This is an early work on adapting deep learning to the LMLL setting.

3. Experimental results on several benchmark datasets confirm that: i) CoMFM performs competitively against state-of-the-art methods at lower computational cost, and is surprisingly 10-120x faster than recent embedding-based methods with similar accuracy; ii) DXML outperforms all embedding-based methods.
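The two ingredients of CoMFM described above, a low-dimensional label embedding derived from the label co-occurrence matrix and a factorization machine with one output per embedding dimension, can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the abstract does not give the formulation, so the sketch assumes a plain eigendecomposition as a stand-in for the collaborative filtering step and a second-order factorization machine with separate factors per output. All function names, parameter shapes, and the toy data are hypothetical.

```python
import numpy as np

def label_cooccurrence(Y):
    """Co-occurrence counts C = Y^T Y for a binary label matrix Y (n_samples x n_labels)."""
    Y = np.asarray(Y, dtype=float)
    return Y.T @ Y

def embed_labels(C, dim):
    """Rank-`dim` label embeddings Z with Z @ Z.T approximating C.

    Assumption: an eigendecomposition replaces the collaborative
    filtering procedure, which the abstract does not specify.
    """
    w, Q = np.linalg.eigh(C)              # C is symmetric positive semidefinite
    top = np.argsort(w)[::-1][:dim]       # indices of the largest eigenvalues
    return Q[:, top] * np.sqrt(np.clip(w[top], 0.0, None))

def multi_output_fm(X, w0, W, V):
    """Second-order factorization machine with K outputs (hypothetical layout).

    X: (n, d) features, w0: (K,) biases, W: (d, K) linear weights,
    V: (K, d, F) factor matrices, one per output.
    Returns an (n, K) array of predictions.
    """
    linear = X @ W + w0
    # Standard FM identity: sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]
    xv = np.einsum("nd,kdf->nkf", X, V)
    x2v2 = np.einsum("nd,kdf->nkf", X ** 2, V ** 2)
    pairwise = 0.5 * (xv ** 2 - x2v2).sum(axis=2)
    return linear + pairwise

# Toy example: 3 samples, 3 labels.
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1]])
C = label_cooccurrence(Y)
Z = embed_labels(C, dim=2)                # one 2-dimensional vector per label
```

In a setup like this, the factorization machine would be trained to map each feature vector to the embedding of its label set, and predictions would be decoded back to labels through Z. That training and decoding procedure is an assumption here and is not described in the abstract.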

Keywords/Search Tags:

Large-scale Multi-label Learning, Deep Learning, Label Graph, Multioutput Factorization Machine, Collaborative Filtering

Related items

1	Study On Multi-label Learning Methods Based On Non-negative Matrix Factorization And Extreme Learning Machine
2	Research On Feature Selection Method Based On Multi-label Learning Theories
3	Imbalanced Multi-label Learning Algorithm Based On Density Label Space
4	Research On Machine Learning Algorithms For Data With Multiple Annotations
5	The Research Of Machine Learning Methods Based On Label Distribution Learning
6	Research On Deep Learning Based Multi-Label Image Learning Algorithm And Application
7	Contributions To Several Issues Of Multi-Label Learning
8	Multi-label Learning Based On Label Weight And Weighted Kernel Extreme Learning Machine
9	Research On Label Embedding In Ambiguous Machine Learning
10	Research On Multi-label Data Classification Technology