Research On The Key Problems Of Multi-Output Classification And Its Extensions

Posted on:2020-03-21

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Z C Ma

Full Text:PDF

GTID:1488306494469894

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

Multi-output classification aims to predict multiple outputs for an input,where the output values are characterized by diverse data types,such as binary,nominal and ordinal.Such learning tasks arise in a variety of real-world applications,ranging from document classification,computer emulation,sensor network analysis,information retrieval,to video analysis,image annotation,gene function prediction and brain science.In recent years,this learning paradigm has received extensive attentions in the field of machine learning.In particular,a workshop was held specifically for the learning paradigm at the Asian Conference on Machine Learning(ACML)in Beijing in November 2018.Nevertheless,existing researches mainly focus on supervised multi-output classification tasks(multi-class classification,multi-label classification)with simple output structures,and less consider the real-world applications often involving diverse output space structures and incomplete output information,making it difficult to meet the needs of complex multi-output classification tasks.To overcome these drawbacks,this work attempts to comprehensively consider the various output space structures characterized by various variables,and to directly face more complex(supervised and weakly supervised)multi-output classification tasks,so it can meet the increasingly urgent real demands.The main contributions of this paper are as follows:(1)We propose a novel method,i.e.,SparseSBLR(sparse similarity-based multi-label learning by logistic regression),for a special multi-output classification task,whose output space is characterized by multiple binary variables or single nominal variable.Specifically,SparseSBLR unites distance-based learning and generalized linear models to achieve the best of both worlds.Each learned parameter of the model can reveal the contribution of one class to another,providing interpretability to some extent.Moreover,since most classes are irrelevant,this method can also obtain sparse output space structure and thus prevent from impairing performance of irrelevant classes.Experimental results show the effectiveness of the proposed method on such multi-output classification problems.(2)We propose a novel method,i.e.,MLKT-gMML-I(combining muti-label like transformation approach with iterative geometric mean metric learning),for a special multi-output classification task,whose output space is characterized by multiple nominal variables.Because the output space of this task often involves structures not only among class variables but also among their class values,the output space structure of such multi-output classification tasks is complex.In order to model such an output space structure,this paper adopts the strategy of first transforming the output space and then learning in the transformed space.Accordingly,we propose a novel method,i.e.,MLKT-gMMLI,whose contributions to such multi-output classification tasks mainly lie in two aspects.Firstly,the proposed transforming method equivalently converts the output space characterized by multiple nominal variables into an output space characterized by constrained multiple binary variables.Secondly,for learning the transformed problem,a metric-based learning model is proposed.This model has a closed-form solution and thus accelerates the training process.Secondly,the metric-based learning model proposed for learning conversion problem has a closed solution,which accelerates the training process of the model.Moreover,the learned metric can indirectly reflect the output space structure of the original multi-output classification task.The learned metric can indirectly reveal the output space structure of the original multi-output classification task.Extensive experiments show the effectiveness of the proposed method(3)We propose a novel learning framework,i.e.,ConMOOC(convex multiple ordinal output classification),for a special multi-output classification task,whose output space is characterized by multiple ordinal variables.The output space structures to be learned not only involve the relationships among multiple ordinal variables but also their discrete ordinal values.This paper proposes an effective formulation to address the above challenging problems.Under this formulation,the objective function is convex.Specifically,we use an effective threshold-based loss function to fit their ordinal values and a regularization formulation to model the relationships among multiple ordinal variables Experimental analysis shows that ConMOOC can not only obtain effective classification performance,but also obtain explicit relationships among variables(4)We propose a novel method,i.e.,LS-eScrm+LP(combining a least squares estimated structured conditional risk minimization approach and an inference method via using linear programming),for a special multi-output classification task,whose output space is characterized by diverse heterogeneous variables.To the best of our knowledge,this task is studied for the first time.Such multi-output classification tasks are far more complex than the above ones,in which the output space structure involves the relationship not only among diverse class variables,but also among the class values of diverse class variables.In order to model such complex output space structure,this paper also adopts the strategy of first transforming the output space and then learning in the new space The resulting method LS-eScrm+LP contributes to such tasks lied in two aspects.Firstly,the proposed transforming method LP equivalently converts the complex output space characterized by multiple heterogeneous variables into the output space characterized by multiple homogeneous binary variables with constrains.Secondly,for learning the transformed problem,the proposed structured prediction method LS-eScrm can not only keep the predicting efficient,but also embed the implicit structure among class variables into model learning to improve classification accuracy.Extensive experiments show the effectiveness of LS-eScrm+LP on such multi-output classification tasks(5)We propose a novel method,i.e.,DM2L(discriminant multi-label learning),for a special weakly supervised multi-output classification task with incomplete output information,the output space of this task is characterized by multiple binary variables.In reality,there are a large number of weakly supervised multi-output classification problems with incomplete output information We develop an effective yet simple method DM2L.Specifically,we impose on the one hand a low-rank structure for the predictions of instances from the same labels,and on the other hand a maximally separated structure for the predictions of instances from different labels.In this way,these low-rank structures can help both modeling local and global label structures and recovering missing labels.Our theoretical analysis subsequently supports the results.Moreover,the maximally separated structure can provide more underlying discriminant information.Extensive experiments demonstrate the effectiveness of the proposed approach.

Keywords/Search Tags:

machine learning, multi-output classification, supervised learning, weakly supervised learning, output space structure, ordinal variable, binary variable, nominal variable

PDF Full Text Request

Related items

1	Research On Weakly Supervised Learning Based On Controlled Random Walk Model
2	Research On Weakly-supervised Classification Methods Based On Samples And Labels Modeling
3	Research On The Scheme Of Optimizing The Performance Of Classification Model In Weakly Supervised Learning
4	Robust Image Classification Algorithms With Weakly Supervised Learning
5	Fault Classification With Weakly Supervised Learning
6	Partial Label Learning Algorithms Based On Maximum Margin And Semi-supervised Learning
7	Research On Weakly Supervised Object Detection And Classification Via Multiple Instance Learning Methods
8	Modeling And Optimization Of Latent Variable Model
9	Multi-label Image Classification Under Weakly Supervised Learning
10	Image Anomaly Detection Based On Self-supervised And Weakly-supervised Learning