Font Size: a A A

Research On The Key Problems Of Multi-Output Classification And Its Extensions

Posted on:2020-03-21Degree:DoctorType:Dissertation
Country:ChinaCandidate:Z C MaFull Text:PDF
GTID:1488306494469894Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Multi-output classification aims to predict multiple outputs for an input,where the output values are characterized by diverse data types,such as binary,nominal and ordinal.Such learning tasks arise in a variety of real-world applications,ranging from document classification,computer emulation,sensor network analysis,information retrieval,to video analysis,image annotation,gene function prediction and brain science.In recent years,this learning paradigm has received extensive attentions in the field of machine learning.In particular,a workshop was held specifically for the learning paradigm at the Asian Conference on Machine Learning(ACML)in Beijing in November 2018.Nevertheless,existing researches mainly focus on supervised multi-output classification tasks(multi-class classification,multi-label classification)with simple output structures,and less consider the real-world applications often involving diverse output space structures and incomplete output information,making it difficult to meet the needs of complex multi-output classification tasks.To overcome these drawbacks,this work attempts to comprehensively consider the various output space structures characterized by various variables,and to directly face more complex(supervised and weakly supervised)multi-output classification tasks,so it can meet the increasingly urgent real demands.The main contributions of this paper are as follows:(1)We propose a novel method,i.e.,SparseSBLR(sparse similarity-based multi-label learning by logistic regression),for a special multi-output classification task,whose output space is characterized by multiple binary variables or single nominal variable.Specifically,SparseSBLR unites distance-based learning and generalized linear models to achieve the best of both worlds.Each learned parameter of the model can reveal the contribution of one class to another,providing interpretability to some extent.Moreover,since most classes are irrelevant,this method can also obtain sparse output space structure and thus prevent from impairing performance of irrelevant classes.Experimental results show the effectiveness of the proposed method on such multi-output classification problems.(2)We propose a novel method,i.e.,MLKT-gMML-I(combining muti-label like transformation approach with iterative geometric mean metric learning),for a special multi-output classification task,whose output space is characterized by multiple nominal variables.Because the output space of this task often involves structures not only among class variables but also among their class values,the output space structure of such multi-output classification tasks is complex.In order to model such an output space structure,this paper adopts the strategy of first transforming the output space and then learning in the transformed space.Accordingly,we propose a novel method,i.e.,MLKT-gMMLI,whose contributions to such multi-output classification tasks mainly lie in two aspects.Firstly,the proposed transforming method equivalently converts the output space characterized by multiple nominal variables into an output space characterized by constrained multiple binary variables.Secondly,for learning the transformed problem,a metric-based learning model is proposed.This model has a closed-form solution and thus accelerates the training process.Secondly,the metric-based learning model proposed for learning conversion problem has a closed solution,which accelerates the training process of the model.Moreover,the learned metric can indirectly reflect the output space structure of the original multi-output classification task.The learned metric can indirectly reveal the output space structure of the original multi-output classification task.Extensive experiments show the effectiveness of the proposed method(3)We propose a novel learning framework,i.e.,ConMOOC(convex multiple ordinal output classification),for a special multi-output classification task,whose output space is characterized by multiple ordinal variables.The output space structures to be learned not only involve the relationships among multiple ordinal variables but also their discrete ordinal values.This paper proposes an effective formulation to address the above challenging problems.Under this formulation,the objective function is convex.Specifically,we use an effective threshold-based loss function to fit their ordinal values and a regularization formulation to model the relationships among multiple ordinal variables Experimental analysis shows that ConMOOC can not only obtain effective classification performance,but also obtain explicit relationships among variables(4)We propose a novel method,i.e.,LS-eScrm+LP(combining a least squares estimated structured conditional risk minimization approach and an inference method via using linear programming),for a special multi-output classification task,whose output space is characterized by diverse heterogeneous variables.To the best of our knowledge,this task is studied for the first time.Such multi-output classification tasks are far more complex than the above ones,in which the output space structure involves the relationship not only among diverse class variables,but also among the class values of diverse class variables.In order to model such complex output space structure,this paper also adopts the strategy of first transforming the output space and then learning in the new space The resulting method LS-eScrm+LP contributes to such tasks lied in two aspects.Firstly,the proposed transforming method LP equivalently converts the complex output space characterized by multiple heterogeneous variables into the output space characterized by multiple homogeneous binary variables with constrains.Secondly,for learning the transformed problem,the proposed structured prediction method LS-eScrm can not only keep the predicting efficient,but also embed the implicit structure among class variables into model learning to improve classification accuracy.Extensive experiments show the effectiveness of LS-eScrm+LP on such multi-output classification tasks(5)We propose a novel method,i.e.,DM2L(discriminant multi-label learning),for a special weakly supervised multi-output classification task with incomplete output information,the output space of this task is characterized by multiple binary variables.In reality,there are a large number of weakly supervised multi-output classification problems with incomplete output information We develop an effective yet simple method DM2L.Specifically,we impose on the one hand a low-rank structure for the predictions of instances from the same labels,and on the other hand a maximally separated structure for the predictions of instances from different labels.In this way,these low-rank structures can help both modeling local and global label structures and recovering missing labels.Our theoretical analysis subsequently supports the results.Moreover,the maximally separated structure can provide more underlying discriminant information.Extensive experiments demonstrate the effectiveness of the proposed approach.
Keywords/Search Tags:machine learning, multi-output classification, supervised learning, weakly supervised learning, output space structure, ordinal variable, binary variable, nominal variable
PDF Full Text Request
Related items