Research On Partially Labeled Problem Based On Active Learning And Semi-supervised Mechanism

Posted on: 2022-07-19 | Degree: Master | Type: Thesis
Country: China | Candidate: C Liu
GTID: 2518306512461954 | Subject: Master of Engineering
Abstract/Summary:
Partial label learning is a form of weakly supervised learning. It differs from traditional supervised learning in that the label of each training example is not given explicitly but as a candidate label set, which is guaranteed to contain the example's single true label. Most related research assumes that a large number of partially labeled training samples are available in advance, that is, that candidate label sets are easy to obtain. In many practical problems, however, a large number of samples remain unlabeled, and obtaining their partial labels is costly. These unlabeled data also carry information about the data distribution, which is valuable for model training. How to make full use of unlabeled samples alongside partially labeled samples during training is therefore a problem worth studying. This thesis makes the following two contributions:

1. We consider the partial label learning problem in which the training set consists of a small number of partially labeled samples and a large number of unlabeled samples, and propose, for the first time, a partial label learning method based on an active learning mechanism (APLL for short) to construct an effective classifier. First, the weak supervision in the candidate label sets is used to determine the likely labels of the partially labeled samples through an iterative label transfer process. Then, an adaptive sample selection strategy within the active learning framework measures the labeling value of each unlabeled sample by combining its uncertainty, graph density, and label transfer ability, and the most valuable unlabeled samples are selected for manual labeling. Finally, the newly labeled samples are used to re-optimize the existing partially labeled samples, and the final classifier is trained. Compared with the five most representative partial label learning methods, APLL needs only a small amount of manual labeling to achieve a significant improvement in classification accuracy.

2. For partial label problems in which manual labeling is expensive, this research proposes a self-training semi-supervised partial label learning method, which exploits the similarity between the pseudo-labels generated by semi-supervised learning and the original candidate label sets. This method differs from APLL in how it uses the unlabeled samples. Instead of relying on manual labeling through active learning, the self-training semi-supervised method selects candidate unlabeled samples using an uncertainty criterion based on the output of the current classifier. Pseudo-labels are then generated for the selected samples with the graph-based label transfer method, and the samples are added to the partially labeled sample set. Because the information in the unlabeled samples enters the training process through the semi-supervised mechanism, the proposed method achieves better performance than related partial label learning methods.
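The abstract does not give the formulas behind the label transfer step or the selection criterion, so the following is only a minimal sketch under stated assumptions, not the thesis's algorithm. It assumes that label transfer can be read as graph-based label propagation over a k-nearest-neighbor graph, with each partially labeled sample's label confidences restricted to its candidate set, and it scores unlabeled samples by prediction entropy alone (the graph density and label transfer ability terms are omitted). The function names `knn_affinity`, `propagate`, and `most_uncertain`, the parameters `k`, `alpha`, and `n_iter`, and the toy data are all hypothetical.

```python
import numpy as np

def knn_affinity(X, k=2):
    """Row-normalized k-nearest-neighbor affinity matrix with Gaussian weights."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                 # a sample is not its own neighbor
    nn = np.argsort(d2, axis=1)[:, :k]           # indices of the k nearest samples
    rows = np.arange(len(X))[:, None]
    sigma2 = d2[rows, nn].mean()                 # data-driven kernel width (assumption)
    W = np.zeros_like(d2)
    W[rows, nn] = np.exp(-d2[rows, nn] / sigma2)
    return W / W.sum(axis=1, keepdims=True)

def propagate(W, candidates, alpha=0.5, n_iter=50):
    """Iteratively spread label confidences over the graph.

    candidates is an (n, q) 0/1 matrix; an all-zero row marks an unlabeled
    sample, for which every class is allowed.
    """
    has_set = candidates.sum(axis=1) > 0
    mask = np.where(has_set[:, None], candidates, 1.0)
    F0 = mask / mask.sum(axis=1, keepdims=True)  # uniform over allowed labels
    F = F0.copy()
    for _ in range(n_iter):
        F = alpha * (W @ F) + (1 - alpha) * F0   # mix neighbor votes with the prior
        F = F * mask                             # confidence stays inside the candidate set
        F = F / F.sum(axis=1, keepdims=True)
    return F

def most_uncertain(F, unlabeled_idx, n_select=1):
    """Pick the unlabeled samples whose confidence vectors have the highest entropy."""
    P = F[unlabeled_idx]
    entropy = -(P * np.log(P + 1e-12)).sum(axis=1)
    order = np.argsort(-entropy)[:n_select]
    return np.asarray(unlabeled_idx)[order]

# Toy data: two tight clusters plus one isolated point, three classes.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [3.0, 3.0], [3.1, 3.0], [3.0, 3.1],
              [1.5, 1.5]])
C = np.array([[1, 1, 0], [1, 0, 1], [0, 0, 0],   # cluster 1: shared candidate is class 0
              [0, 1, 1], [1, 1, 0], [0, 0, 0],   # cluster 2: shared candidate is class 1
              [0, 0, 0]], dtype=float)

F = propagate(knn_affinity(X), C)
query = most_uncertain(F, unlabeled_idx=[2, 5, 6])
```

In an active learning loop, the samples in `query` would be sent for manual labeling. In a self-training variant, the rows of `F` for the selected samples could instead be thresholded into pseudo candidate sets. Both uses are illustrations of the ideas in the abstract rather than the procedures defined in the thesis.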
Keywords/Search Tags: Partial label learning, Active learning, Sample selection strategy, Self-training semi-supervised learning, Pseudo-label generation