Positive And Unlabeled Learning Based On Loss Decomposition And Centroid Estimation

Posted on: 2021-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: H Shi
GTID: 2518306512487274
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Machine learning has recently become an active research topic that touches nearly every aspect of human life. It aims to extract general knowledge from a set of known data so that unknown data can be handled or analyzed. PU learning (Positive and Unlabeled learning), a form of weakly supervised learning, is proposed to learn an accurate binary classifier from PU datasets alone, that is, from positive and unlabeled examples only. PU learning arises in many practical situations where acquiring negative examples is too difficult or expensive. This thesis therefore focuses on PU learning algorithms based on loss decomposition and centroid estimation, and mainly covers the following three aspects:

1) A linear PU learning algorithm based on Loss Decomposition and Centroid Estimation (LDCE) is proposed. In this algorithm, the unlabeled dataset is regarded as a noisy negative dataset, and the empirical risk on PU datasets is divided into two parts: a noise-related part and a noise-irrelevant part. An unbiased estimate of the true empirical risk on PU datasets is then obtained by introducing an unbiased centroid estimate, which eliminates the impact of the noise.

2) A kernelized model termed Kernelized LDCE (KLDCE) is proposed. Building on the LDCE algorithm, KLDCE is designed to tackle PU learning problems in non-linear cases by introducing the kernel function, the SMO algorithm, and the ACS method. Moreover, a generalization bound for the KLDCE algorithm is proved theoretically through Rademacher complexity analysis, which supports the reliability of the KLDCE algorithm.

3) Diverse experimental results on synthetic and practical datasets, including classification accuracies and decision boundaries, indicate that the proposed algorithms are superior to the existing state-of-the-art PU learning algorithms. Analyses of parameter sensitivity, the CPU time of the LDCE and KLDCE algorithms, the impact of an inexact flipping probability, and the importance of the model's constraint further demonstrate the stability and effectiveness of the proposed algorithms.
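
The split of the empirical risk into a label-independent part and a label-dependent part can be illustrated with a small numerical check. The sketch below is not the thesis's LDCE implementation: it assumes a linear scorer and the squared loss (1 - y f(x))^2, both chosen here only because the decomposition is then an exact algebraic identity, and all function names are hypothetical. It shows that the labels enter the risk only through the centroid mean(y_i x_i), which is why an unbiased estimate of that centroid is enough to correct for label noise. How the thesis estimates the centroid from PU data is not reproduced here.

```python
import numpy as np

def squared_loss_risk(w, X, y):
    """Empirical risk of a linear scorer under the squared loss (1 - y * f(x))^2."""
    margins = y * (X @ w)
    return np.mean((1.0 - margins) ** 2)

def decomposed_risk(w, X, y):
    """Return the same risk split into (label-independent, label-dependent) parts.

    Since y is in {-1, +1}, (1 - y f)^2 = 1 + f^2 - 2 y f, so the labels
    appear only through the centroid mean(y_i * x_i).
    """
    scores = X @ w
    label_free = 1.0 + np.mean(scores ** 2)       # unaffected by label noise
    centroid = (y[:, None] * X).mean(axis=0)      # mean of y_i * x_i
    label_dependent = -2.0 * (w @ centroid)       # the only place labels enter
    return label_free, label_dependent

# Synthetic data (assumption: random features and random +/-1 labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.choice([-1.0, 1.0], size=200)
w = rng.normal(size=5)

direct = squared_loss_risk(w, X, y)
label_free, label_dependent = decomposed_risk(w, X, y)
print(np.isclose(direct, label_free + label_dependent))
```

Flipping any subset of labels leaves `label_free` unchanged and alters only `label_dependent`, so in this simplified setting the effect of noisy labels is confined to the centroid term.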
Keywords/Search Tags: PU Learning, Loss Decomposition, Centroid Estimation, Kernel Function, Rademacher Complexity