
Research On Multi-label Learning With Inaccurate Labels

Posted on: 2021-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: S He
Full Text: PDF
GTID: 2428330611464275
Subject: Computer application technology
Abstract/Summary:
Classification tasks in machine learning aim to learn models from training samples with known classes, so that the learned models can predict the classes of unseen samples. According to the setting of the label space, classification tasks fall into three lines: binary classification, multi-class classification, and multi-label classification. The first two are single-label classification, in which each sample corresponds to exactly one label; multi-label classification extends this setting so that each sample corresponds to more than one label, which better matches real-world scenarios, where an object is normally associated with multiple semantic classes.

However, it is difficult, even impossible, to obtain all the ground-truth labels of a multi-label sample. On the one hand, manually labeling samples with many classes is time-consuming and laborious, and mistakes are easy to make even for skilled annotators. On the other hand, the inherent semantic ambiguity between labels tends to produce many false (noisy) labels. Multi-label samples therefore inevitably contain noisy labels, which directly mislead model training and deteriorate the model's generalization ability. This thesis aims to train robust models from multi-label samples with inaccurate labels, so as to mitigate the adverse effects of noisy labels. Specifically, it focuses on two noisy-label learning paradigms: partial label learning (PLL) and partial multi-label learning (PML). In the former, exactly one ground-truth label lies in the candidate label set; in the latter, several unknown ground-truth labels lie in the candidate label set.

A unified training framework is designed for three classification tasks: traditional multi-label learning (ML), PLL, and PML. It consists of two iterative, mutually reinforcing processes: model training and label confidence estimation. In model training, instead of using the original discrete label values, the label confidences estimated in the previous iteration are used as supervision; they carry more semantic information and provide more discernible supervision for training. In confidence estimation, the model outputs of the previous iteration, together with task-specific regularization terms, are used to dynamically estimate the label confidences. Three customized regularizers and their corresponding constraints handle the three learning tasks respectively.
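The alternating procedure described above can be pictured with a small sketch. The code below is a minimal NumPy illustration, not the thesis's implementation: a linear scorer is fitted against the current label-confidence matrix, and the confidences are then re-estimated from the model outputs restricted to each sample's candidate set (here simply renormalized, standing in for the task-specific regularizers and constraints). All function names, variable names, and hyper-parameters (train_model, estimate_confidence, the learning rate, the number of rounds) are hypothetical.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_model(X, Q, lr=0.1, epochs=200):
    # Fit a linear multi-label scorer against the soft confidences Q
    # (n_samples x n_labels) instead of the original discrete labels.
    n, d = X.shape
    W = np.zeros((d, Q.shape[1]))
    for _ in range(epochs):
        P = sigmoid(X @ W)               # current label probabilities
        W -= lr * X.T @ (P - Q) / n      # gradient step on a cross-entropy-style fit term
    return W

def estimate_confidence(X, W, candidates):
    # Re-estimate confidences from the model outputs, keeping all mass on the
    # candidate labels; this renormalization is only a stand-in for the
    # regularized estimation step used in the thesis.
    P = sigmoid(X @ W) * candidates      # zero out non-candidate labels
    s = P.sum(axis=1, keepdims=True)
    return np.divide(P, s, out=np.zeros_like(P), where=s > 0)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))                               # 6 toy samples, 3 features
candidates = (rng.random((6, 4)) < 0.5).astype(float)     # candidate-label mask, 4 labels
candidates[:, 0] = 1.0                                    # ensure every sample has a candidate
Q = candidates / candidates.sum(axis=1, keepdims=True)    # start from uniform confidences

for _ in range(5):                                        # the two steps reinforce each other
    W = train_model(X, Q)
    Q = estimate_confidence(X, W, candidates)
print(np.round(Q, 3))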
The three task-specific regularizers and constraints are as follows.

(1) In traditional multi-label learning (ML), a joint regularizer based on the instance graph and label correlations is proposed to constrain the label confidence matrix; as for the constraints, the overall scale of the label confidences is expanded to satisfy the practical requirement.

(2) In partial label learning, the entropy of the label confidences is used as the regularizer to polarize the confidences of candidate labels. Since only one candidate label of a PLL sample is the ground-truth label, the candidate label confidences are constrained to form a proper probability distribution.

(3) In partial multi-label learning, candidate labels are divided into reliable and unreliable candidate labels, and a soft sign thresholding operation is proposed to adaptively increase the confidences of reliable candidate labels and reduce the confidences of unreliable ones (an illustrative sketch follows the abstract). Since a PML sample includes several ground-truth labels, the candidate label confidences are tied to a parameter (i.e., the average confidence of the ground-truth labels).

Extensive experiments on synthetic and real-world datasets demonstrate the effectiveness of the three proposed algorithms. This thesis comprises five chapters: Chapter 1 introduces the research background of the three learning paradigms; Chapters 2, 3, and 4 present the improved algorithms for the three settings respectively; Chapter 5 summarizes this work and discusses future work.
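As a companion illustration for item (3), the snippet below shows one plausible reading of a soft sign thresholding step, assuming a scalar threshold (standing in for the parameter interpreted as the average confidence of the ground-truth labels) and a step size tau; the names soft_sign_threshold, threshold, and tau, and the exact functional form, are hypothetical rather than taken from the thesis.

import numpy as np

def soft_sign_threshold(conf, candidates, threshold, tau=0.1):
    # Candidate labels whose confidence exceeds the threshold are treated as
    # reliable and pushed up by tau; the remaining candidates are treated as
    # unreliable and pushed down by tau. Non-candidate labels stay at zero.
    shift = tau * np.sign(conf - threshold)
    updated = np.clip(conf + shift, 0.0, 1.0)
    return updated * candidates

conf = np.array([[0.9, 0.6, 0.3, 0.0],
                 [0.2, 0.8, 0.5, 0.0]])
candidates = np.array([[1.0, 1.0, 1.0, 0.0],
                       [1.0, 1.0, 1.0, 0.0]])
print(soft_sign_threshold(conf, candidates, threshold=0.55))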
Keywords/Search Tags: multi-label learning, inaccurate labels, candidate labels, label confidence