Research On Batch Active Learning Algorithm Based On Generative Adversarial Networks

Posted on: 2021-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
GTID: 2428330611965584
Subject: Computer technology
Abstract/Summary:
With the rapid development and wide application of machine learning, active learning, which addresses the cost of data labeling in machine learning tasks, has important research value. Recently, some studies have introduced generative adversarial networks (GANs) into the field of active learning. With a GAN, the decision boundary can be annotated directly, and decision boundary annotation has been shown to improve model accuracy. In addition, batch active learning alleviates the time cost of conventional active learning methods to a certain extent. However, the most time-consuming step in each iteration of active learning is retraining the model, so other ways of simplifying the training process should be considered. To combine the advantages of batch sampling and decision boundary annotation, this thesis proposes a new active learning method based on generative adversarial networks. The method lets human experts label the model's decision boundary directly through a GAN, and it selects samples in batches according to the decision boundary annotation. Compared with conventional active learning, the method avoids retraining the model in each iteration, which reduces the computational burden and time cost. This thesis also proposes a clustering-based initial seed sampling strategy that improves the proposed GAN-based active learning method. The strategy estimates the overall distribution of the samples by clustering and chooses the algorithm's initial seeds according to that distribution. Experimental results on three datasets show that the GAN-based active learning method effectively reduces the data labeling cost of building machine learning models, and that the clustering-based initial seed sampling strategy effectively improves the performance of the method.

To consider multiple sample selection criteria simultaneously during batch sampling, this thesis also proposes a multi-criteria batch sample selection strategy. The strategy takes the informativeness, representativeness, and diversity of samples as constraints for batch sampling, then designs a sampling objective function and gives a specific optimization procedure for it. Combining the multi-criteria batch sample selection strategy with the GAN-based active learning method, this thesis proposes a new batch active learning method with two-stage sampling, named BALTS, which has the advantages of both. Experimental results on three datasets show that the multi-criteria batch sample selection strategy effectively improves the performance of GAN-based active learning methods, and that BALTS reduces the data labeling cost of building machine learning models more effectively.
Keywords/Search Tags:Active learning, GANs, Batch sampling, Decision boundary annotation
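
The abstract names two sampling ideas without giving their formulas: choosing initial seeds from a clustering of the pool, and choosing a batch by weighing informativeness, representativeness, and diversity. The Python sketch below illustrates one plausible reading of each, and it is not the thesis's actual algorithm. Everything in it is an assumption made for illustration: the function names (cluster_seeds, select_batch), the use of k-means with farthest-first initialization, the binary-uncertainty score, the similarity-based representativeness score, the weights w_info, w_rep, and w_div, and the greedy weighted-sum selection that stands in for the thesis's unspecified objective function and optimization procedure.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first(points, k):
    """Deterministic initial centers: each new one is farthest from those chosen."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def cluster_seeds(points, k, iters=20):
    """Assumed seed strategy: run k-means, then pick the pool point nearest each center."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[c]
            for c, g in enumerate(groups)
        ]
    return [min(range(len(points)), key=lambda i: dist(points[i], c)) for c in centers]


def select_batch(points, probs, batch_size, w_info=1.0, w_rep=1.0, w_div=1.0):
    """Assumed greedy batch selection over a weighted sum of three criteria."""
    n = len(points)
    # Informativeness: binary uncertainty, 1 at p = 0.5 and 0 at p = 0 or 1.
    info = [1.0 - abs(2.0 * p - 1.0) for p in probs]
    # Representativeness: average similarity of a sample to the whole pool.
    rep = [sum(1.0 / (1.0 + dist(points[i], q)) for q in points) / n for i in range(n)]
    chosen = []
    while len(chosen) < min(batch_size, n):
        def score(i):
            # Diversity: squashed distance to the nearest sample already in the batch.
            if chosen:
                d = min(dist(points[i], points[j]) for j in chosen)
                div = d / (1.0 + d)
            else:
                div = 0.0
            return w_info * info[i] + w_rep * rep[i] + w_div * div
        chosen.append(max((i for i in range(n) if i not in chosen), key=score))
    return chosen


# Toy pool: two well-separated groups of three points each (made-up data).
pool = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# Hypothetical classifier probabilities for the positive class.
pool_probs = [0.9, 0.5, 0.8, 0.1, 0.45, 0.95]

seeds = cluster_seeds(pool, k=2)          # one seed index per cluster
batch = select_batch(pool, pool_probs, batch_size=2)
print(seeds, batch)
```

On the toy pool, cluster_seeds returns one index from each group, so the initial labeled set covers both regions of the data. In select_batch, setting w_rep and w_div to 0 reduces the selection to plain uncertainty sampling, which makes the contribution of the other two criteria easy to inspect.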