Research On The Method Of Selecting Samples For Pool Model In Active Learning

Posted on:2021-03-28

Degree:Master

Type:Thesis

Country:China

Candidate:Y X Wang

Full Text:PDF

GTID:2428330605979314

Subject:Computer application technology

Abstract/Summary:

Active learning addresses the need of supervised learning for a large number of labeled training samples. Its core problem is the sample selection strategy, whose goal is to make the model converge quickly. In the pool-based setting, samples are selected in batches, so the selected samples may carry redundant information, which reduces the efficiency of active learning. This thesis finds that the redundancy arises mainly in two places: within the set of samples to be labeled, and between the set of samples to be labeled and the labeled sample set.

For neural networks that include an MLP (Multi-Layer Perceptron), an information redundancy matrix is defined, and two optimization methods are built on it: DRAL (Discrimination And Redundancy Active Learning) and LDRAL (Labeled Discriminative And Redundancy Active Learning).

DRAL targets the redundancy within the set of samples to be labeled. A candidate set consisting of many unlabeled samples is first selected with the original selection method. The set of samples to be labeled is initialized with the samples that are most similar to the candidate set; then, at each step, the candidate sample that is most dissimilar to the current set of samples to be labeled is added to it.

LDRAL targets the redundancy between the set of samples to be labeled and the labeled sample set. As the number of iterations increases, the labeled sample set keeps growing, and the cost of computation eventually exceeds the hardware limit. An uncertainty threshold is therefore defined to select an uncertain labeled sample set, which replaces the full labeled sample set. The candidate samples that are most dissimilar to the uncertain labeled sample set are then selected as the set of samples to be labeled.

In experiments on the MNIST, Fashion-MNIST and CIFAR-10 datasets, with samples selected by the uncertainty reduction method and compared at the same accuracy, DRAL reduces the number of labeled samples by at least 11% and LDRAL by at least 8.3%, which shows that both methods effectively mitigate information redundancy.
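
The DRAL selection procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the definition of the information redundancy matrix, so cosine similarity between feature vectors (for example, MLP-layer activations) stands in for it. "Most dissimilar to the set" is read here as "lowest maximum similarity to any sample already selected". The function names and the parameters `candidate_size`, `batch_size` and `seed_size` are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.clip(np.linalg.norm(a, axis=1, keepdims=True), 1e-12, None)
    b_unit = b / np.clip(np.linalg.norm(b, axis=1, keepdims=True), 1e-12, None)
    return a_unit @ b_unit.T


def select_diverse_batch(features, uncertainty, candidate_size, batch_size, seed_size=1):
    """Pick batch_size pool indices that are uncertain but mutually dissimilar.

    features:    (n, d) feature vectors of the unlabeled pool (assumed input).
    uncertainty: (n,) scores from the base selection method (higher = more uncertain).
    """
    features = np.asarray(features, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    if not 0 < seed_size <= batch_size <= candidate_size <= len(features):
        raise ValueError("need 0 < seed_size <= batch_size <= candidate_size <= n")

    # Step 1: candidate set = the most uncertain pool samples.
    candidate_idx = np.argsort(-uncertainty)[:candidate_size]
    cand = features[candidate_idx]

    # Stand-in for the redundancy matrix (assumption): pairwise cosine similarity.
    sim = cosine_similarity_matrix(cand, cand)

    # Step 2: seed with the candidates most similar, on average, to the candidate set.
    selected = [int(i) for i in np.argsort(-sim.mean(axis=1))[:seed_size]]

    # Step 3: repeatedly add the candidate least similar to everything selected so far.
    while len(selected) < batch_size:
        remaining = [i for i in range(len(cand)) if i not in selected]
        max_sim = sim[np.ix_(remaining, selected)].max(axis=1)
        selected.append(remaining[int(np.argmin(max_sim))])

    # Map positions in the candidate set back to indices in the pool.
    return candidate_idx[selected]
```

With a pool containing three near-duplicate vectors and one distinct vector, plain top-3 uncertainty ranking would return the three near-duplicates, whereas this sketch includes the distinct sample in the batch.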

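The LDRAL idea of comparing candidates only against an uncertain subset of the labeled samples can be sketched in a similar way. The abstract does not specify the uncertainty measure or the redundancy matrix, so this sketch assumes predictive entropy of the model's class probabilities and cosine similarity between feature vectors. The function names, the `threshold` parameter and the fallback for an empty uncertain set are hypothetical.

```python
import numpy as np


def predictive_entropy(probs):
    """Entropy of each row of class probabilities (assumed uncertainty measure)."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def select_against_uncertain_labeled(cand_features, labeled_features, labeled_probs,
                                     threshold, batch_size):
    """Return positions of the candidates least similar to the uncertain labeled set."""
    cand = np.asarray(cand_features, dtype=float)
    labeled = np.asarray(labeled_features, dtype=float)

    # Keep only labeled samples whose uncertainty exceeds the threshold. This
    # smaller set replaces the full labeled set to bound the comparison cost.
    reference = labeled[predictive_entropy(labeled_probs) > threshold]
    if len(reference) == 0:
        # Assumption: with nothing to compare against, keep the candidate order.
        return np.arange(min(batch_size, len(cand)))

    # Cosine similarity between every candidate and every reference sample.
    cand_unit = cand / np.clip(np.linalg.norm(cand, axis=1, keepdims=True), 1e-12, None)
    ref_unit = reference / np.clip(np.linalg.norm(reference, axis=1, keepdims=True), 1e-12, None)
    sim = cand_unit @ ref_unit.T

    # A candidate's redundancy is its highest similarity to any reference sample.
    redundancy = sim.max(axis=1)
    return np.argsort(redundancy)[:batch_size]
```

The size of the reference set, and with it the cost of the similarity computation, depends on the threshold: a higher threshold keeps fewer labeled samples, at the price of ignoring redundancy with labeled samples the model is already confident about.
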
Keywords/Search Tags:

Pool-based sample selection, Information redundancy, Set of samples to be labeled, Uncertain labeled sample set, Uncertainty reduction method

Related items

1	Labeled Samples Expansion-based Band Selection For Hyperspectral Image
2	Terrain Classification Of Polarimetric SAR Image With Limited Labeled Samples
3	Application Of A Small Amount Of Labeled Samples Support Vector Machine Classification
4	The Study Of Attribute Reduction Method Based On Core Samples Set
5	The Research Of Small Samples Uncertainty Based On SVM
6	Research On Quantification Method Of CNN Predictive Uncertainty Based On Evidence Theory
7	Research On Fast Sorting Recognition Technology Based On Known Signal Samples And Signal Knowledge
8	Research On Face Recognition Algorithm Based On Pixel Mapping To Construct Virtual Samples
9	Research On Image Classification Algorithm Based On Imbalanced Samples
10	Adaptive classification of scarcely labeled and evolving data streams