
Research On The Utilization Techniques Of Partial Label Data

Posted on: 2022-09-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z R Zhang
Full Text: PDF
GTID: 2518306740482864
Subject: Software engineering
Abstract/Summary:
In traditional supervised learning, each object is represented by a single instance and associated with a single label, which provides explicit supervision information. In partial label learning, each instance is associated with a set of candidate labels, among which only one is valid but unknown. Single-labeled examples are more conducive to model training, but they cost more to acquire. Partial label examples are cheaper to acquire, but their ground-truth labels are hidden from the model, which makes training more challenging. This thesis investigates techniques for utilizing partial label examples and proposes two methods.

First, we bridge labeled and unlabeled examples in model training through partial label examples, and propose a novel semi-supervised learning approach named Eupal. Specifically, Eupal uses unlabeled data in a new manner: for each unlabeled example it estimates a partial label assignment that contains the ground-truth label, which is less difficult than estimating the exact ground-truth label, as most existing semi-supervised learning approaches do. Accordingly, Eupal induces multi-class classifiers from the labeled examples and the estimated partial label examples respectively, and performs model updates by communicating labeling information. Experimental results show that Eupal improves the classification performance of semi-supervised learning models by utilizing partial label data.

Second, in light of the class-imbalance and noisy-label issues in multi-label classification, we propose a class-imbalance aware partial multi-label learning approach named Ipml. Considering the characteristics of multi-label and partial label examples, Ipml disambiguates partial multi-label data based on k-nearest neighbor aggregation and designs three data-level class-imbalance processing strategies: random oversampling, weighted oversampling, and synthetic oversampling. Extensive experiments on artificial and real-world data sets show that Ipml can effectively alleviate the class-imbalance problem in partial multi-label learning.

This thesis consists of four chapters. The first chapter introduces the background, related work, and the problems to be solved. The second chapter introduces the semi-supervised learning approach Eupal, which utilizes partial label data. The third chapter introduces the class-imbalance learning approach Ipml, which also utilizes partial label data. Finally, the fourth chapter concludes the thesis.
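
The idea of disambiguating candidate labels by k-nearest neighbor aggregation, mentioned in the abstract, can be illustrated with a minimal Python sketch. This is not the thesis's Ipml algorithm, whose details are not given here. It is a generic illustration for the simpler single-label partial label setting, and the function name `knn_disambiguate`, the equal-split voting rule, and the toy data are all assumptions made for this example.

```python
from collections import defaultdict
from math import dist


def knn_disambiguate(features, candidates, k=3):
    """Pick one label per instance from its candidate set.

    Hypothetical helper: each of the k nearest neighbours spreads one unit
    of vote evenly over its own candidate labels, and only votes for labels
    inside the instance's candidate set are counted.
    """
    resolved = []
    for i, (x, cand) in enumerate(zip(features, candidates)):
        # Indices of the k nearest other instances by Euclidean distance.
        neighbours = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: dist(x, features[j]),
        )[:k]
        votes = defaultdict(float)
        for j in neighbours:
            for label in candidates[j]:
                if label in cand:
                    votes[label] += 1.0 / len(candidates[j])
        # Sort the candidates so that ties are broken deterministically.
        resolved.append(max(sorted(cand), key=lambda label: votes[label]))
    return resolved


# Toy data (assumed): two well-separated clusters with ambiguous labels.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
candidates = [{"cat", "dog"}, {"cat"}, {"cat", "bird"},
              {"dog", "bird"}, {"dog"}, {"dog", "cat"}]

print(knn_disambiguate(features, candidates, k=2))
```

In this toy setting, each ambiguous instance is resolved toward the label that its nearest neighbours' candidate sets support most strongly. A partial multi-label variant would keep several labels per instance instead of taking a single maximum.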
Keywords/Search Tags: partial label learning, semi-supervised learning, class-imbalance, partial multi-label learning, weakly supervised learning