
Research On Partial Label Learning With Class Imbalance And Unlabeled Data

Posted on: 2019-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2428330596460879
Subject: Software engineering
Abstract/Summary:
Partial label (PL) learning is an important weakly supervised learning framework that arises widely in real-world scenarios such as computer vision, the internet, and ecoinformatics. In partial label learning, each instance is associated with a candidate label set in which its ground-truth label is concealed. Because the supervision information is not explicit, traditional supervised learning methods cannot be applied directly to partial label learning problems, and specific algorithms that take the intrinsic characteristics of the setting into account are needed. Current strategies for partial label learning can be roughly categorized as algorithm adaptation and problem transformation. Algorithm adaptation enables traditional learning algorithms to handle PL training examples, while problem transformation converts PL examples into other forms that fit traditional learning settings. However, many problems in partial label learning remain to be explored. This thesis investigates two of them: class imbalance and the exploitation of unlabeled data.

First, as a multi-class classification framework, partial label learning is significantly affected by the class-imbalance problem. Existing class-imbalance learning techniques assume that the ground-truth label of each instance is known, whereas the true label of a partial label training example is not accessible, so these techniques are not directly applicable. By considering the intrinsic properties of partial label training examples, this thesis proposes a data-level method called Cimap, which integrates disambiguation and over-sampling techniques. Extensive experiments show that Cimap can effectively alleviate the negative influence of class imbalance on partial label learning.

Second, it is usually difficult to obtain a large amount of labeled data, while unlabeled data is relatively easy to obtain. How to exploit unlabeled data to improve the generalization performance of a learning system is therefore an important question. This thesis proposes a novel algorithm named SemiPL, which uses the partial label learning framework as an intermediary tool for semi-supervised learning. Based on the labeled data, SemiPL iteratively estimates the candidate labels of unlabeled data so that the unlabeled data can be used effectively. Extensive experiments show that SemiPL outperforms several existing semi-supervised learning algorithms.

This thesis contains five chapters. Chapter 1 introduces the problem definition, the state of the art, and open research issues of partial label learning. Chapter 2 briefly reviews existing partial label learning algorithms. Chapter 3 proposes the class-imbalance aware partial label learning algorithm Cimap. Chapter 4 proposes the partial label learning enabled semi-supervised learning algorithm SemiPL. Chapter 5 concludes the thesis.
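
To make the idea of combining disambiguation with over-sampling concrete, the following is a minimal Python sketch. It is not the Cimap algorithm itself, whose details are not given in this abstract. It assumes a simple nearest-neighbour voting heuristic for disambiguation and plain random over-sampling of minority classes. The function names `knn_disambiguate` and `random_oversample`, the parameter `k`, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def knn_disambiguate(X, candidate_sets, k=2):
    """Pick one label per instance from its candidate label set.

    Illustrative heuristic (an assumption, not the thesis's method): each of
    the k nearest neighbours votes for its own candidate labels with weight
    1/|candidate set|, and only votes for labels that are also candidates of
    the current instance are counted.
    """
    labels = []
    for i, (x, cands) in enumerate(zip(X, candidate_sets)):
        # Indices of the k nearest other instances by Euclidean distance.
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(x, X[j]),
        )[:k]
        votes = Counter()
        for j in neighbours:
            for lab in candidate_sets[j]:
                if lab in cands:
                    votes[lab] += 1.0 / len(candidate_sets[j])
        # Ties are broken by the sorted order of the candidate labels.
        labels.append(max(sorted(cands), key=lambda lab: votes[lab]))
    return labels


def random_oversample(X, y, seed=0):
    """Replicate random instances of each minority class until every class
    has as many instances as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for lab, n in counts.items():
        pool = [x for x, l in zip(X, y) if l == lab]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(lab)
    return X_out, y_out


# Toy partial label data: two clusters, some instances with ambiguous labels.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
candidate_sets = [{"a"}, {"a", "b"}, {"a", "b"}, {"a"},
                  {"b"}, {"a", "b"}, {"b"}]

labels = knn_disambiguate(X, candidate_sets, k=2)
X_bal, y_bal = random_oversample(X, labels)
print(labels)
print(Counter(y_bal))
```

In this sketch the ambiguous instances are resolved first, and only then is the minority class replicated, which reflects the observation in the abstract that standard class-imbalance techniques need a known label for each instance before they can be applied.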
Keywords/Search Tags: partial label learning, class imbalance, oversampling technique, semi-supervised learning, unlabeled data