
Research On Label Noise Processing Technology Based On Active Learning

Posted on: 2020-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Yuan
Full Text: PDF
GTID: 2428330590971771
Subject: Computer technology
Abstract/Summary:
To address the scarcity of labeled data and the high cost of labeling, some researchers use active learning to obtain data labels through crowdsourcing. In many real tasks, however, the labelers have varying levels of expertise, which introduces label noise. This work therefore proposes a novel Label Noise Query and Correction (LNQC) algorithm, which automatically finds noisy samples in labeled data sets and delivers them to experts for labeling. The main contributions of this work are as follows:

1. Label Noise Query and Correction (LNQC) is proposed to precisely identify label noise clusters. Based on the performance (confidence) of each sample in each classifier, it constructs a confidence matrix and a residual matrix for each sample and defines a symbol that merges the two into a single feature matrix. The K-means clustering algorithm is then applied to the feature matrix to obtain the label noise clusters. Finally, the samples in these clusters are labeled by experts.

2. LNQC requires experts to label every sample in the label noise clusters, which is costly. To solve this problem, a Sample Importance Value Sorting (SIVS) algorithm is proposed on top of LNQC. SIVS combines the advantages of information entropy and the KNN algorithm and selects only the TOP-K most important samples for labeling.

3. The proposed algorithm is compared with the Adaptive Voting Noise Correction (AVNC) algorithm and cluster-based correction. On 8 real UCI data sets, its F1 measure and accuracy (ACC) are better than those of the AVNC algorithm.

Querying and correcting label noise makes up for the defects of traditional active learning and provides better support for the development of artificial intelligence...
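The abstract does not give formulas for the confidence matrix, the residual matrix, or the SIVS importance value, so the following is only a minimal sketch of one possible reading, not the thesis's actual method. The assumptions are: confidence is the probability a classifier assigns to the sample's given label, residual is the gap between the classifier's top probability and that confidence, and the importance value is the entropy of the labels in a sample's k-nearest-neighbour neighbourhood (including its own label). The function names `build_feature_row`, `label_entropy`, and `rank_by_importance` are hypothetical, and the K-means clustering step is left out.

```python
import math
from collections import Counter


def build_feature_row(probas, given_label):
    """Merge per-classifier confidence and residual into one feature row.

    probas: one dict per classifier, mapping class -> predicted probability.
    ASSUMPTION: confidence = probability of the given label;
    residual = top predicted probability minus that confidence.
    """
    confidence = [p.get(given_label, 0.0) for p in probas]
    residual = [max(p.values()) - c for p, c in zip(probas, confidence)]
    return confidence + residual


def label_entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def rank_by_importance(points, labels, candidates, k=3, top_k=2):
    """Return the top_k candidate indices, highest importance first.

    ASSUMPTION: importance = entropy of the labels of a sample and its
    k nearest neighbours (Euclidean distance). Ties break on the index.
    """
    scored = []
    for i in candidates:
        others = [j for j in range(len(points)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]
        neighbourhood = [labels[i]] + [labels[j] for j in nearest]
        scored.append((label_entropy(neighbourhood), i))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:top_k]]


if __name__ == "__main__":
    # Toy 1-D data: sample 2 carries label 1 inside a region labeled 0.
    points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
    labels = [0, 0, 1, 0, 1, 1, 1]
    print(rank_by_importance(points, labels, candidates=[2, 5], k=2, top_k=1))
```

In a fuller pipeline, the rows from `build_feature_row` would be clustered (for example with K-means) to find suspected label noise clusters, and only the samples returned by `rank_by_importance` would be sent to experts, which is the cost saving the second contribution describes.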
Keywords/Search Tags: Active Learning, Crowdsourcing, Label Noise, Classification