Research On Strategies For Negative Samples In Data Augmentation

Posted on: 2022-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Wang
Full Text: PDF
GTID: 2518306572950899
Subject: Computer Science and Technology
Abstract/Summary:
Data augmentation (DA), whose purpose is to enlarge the dataset by drawing on the underlying data distribution, has been a key technology in machine learning. Deep neural networks have the capacity to model almost any function of interest, which makes them prone to overfitting; because data augmentation is effective at alleviating this problem, it has been widely used across machine learning tasks such as computer vision, natural language processing, and audio signal processing. However, mainstream DA techniques assume invariance between samples before and after augmentation and neglect the differences between them. Although in most cases modeling these differences is unhelpful for deep models, this is not true for the DA transformations studied in this thesis. We refer to samples that should be modeled differently from the samples in the dataset as negative samples.

First, we propose a label smoothing strategy for negative samples in the supervised classification task: negative samples are learned with smoothed labels, while normal positive samples are learned with hard labels (a minimal sketch of such a loss is given after this abstract). In this way, positive and negative samples are separated in the representation space and the desired variance appears. We select three kinds of negative samples from the computer vision literature. The first is noisy samples produced by strong data augmentations, where the noise refers to the loss of information caused by the strong augmentation; learning these samples with our strategy helps models understand the corruption and improves performance in both classification and anomaly detection. The second is negatively transformed samples, which lie outside the data manifold; combined with our strategy, they nonetheless improve classification accuracy. The third is in-distribution samples that are arguably harmful to model learning; with our strategy, models benefit from them instead of being impaired. Experiments on three image classification benchmarks, CIFAR-10, CIFAR-100, and SVHN, demonstrate the effectiveness of our approach.

Second, we propose a hard-negative data augmentation regularization strategy for contrastive learning. Specifically, we introduce an extra negative prior to the model as a regularization term through negative transformations. We consider hard negative transformations for global feature learning and for salient-object feature learning, respectively; the jigsaw transformation, for example, belongs to the first kind. On top of that, we use a saliency map as extra information to select the specific regions on which to apply the negative augmentation, improving the model's ability to learn salient-object features (a sketch of jigsaw-based hard negatives in a contrastive loss also follows this abstract). Experimental results show promising improvements from our strategy.
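The following is a minimal PyTorch sketch of the kind of loss the first strategy describes: hard one-hot targets for normal positive samples and smoothed targets for negative samples. The function name `negative_ls_loss`, the smoothing factor, and the `is_negative` mask are illustrative assumptions, not the thesis's exact implementation.

```python
import torch
import torch.nn.functional as F

def negative_ls_loss(logits, targets, is_negative, smoothing=0.1):
    """Hard labels for positive samples, smoothed labels for negative samples."""
    n_classes = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)

    # Hard one-hot targets for the normal (positive) samples.
    hard = F.one_hot(targets, n_classes).float()

    # Smoothed targets for negative samples: probability mass is moved
    # from the true class and spread uniformly over all classes.
    smooth = hard * (1.0 - smoothing) + smoothing / n_classes

    # Pick the per-sample target distribution according to the mask.
    mask = is_negative.float().unsqueeze(1)
    target_dist = mask * smooth + (1.0 - mask) * hard

    # Cross-entropy between the target distribution and the prediction.
    return -(target_dist * log_probs).sum(dim=1).mean()

# Toy usage: a batch in which the last four samples were produced by a
# strong or negative augmentation and are therefore treated as negatives.
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
is_negative = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
loss = negative_ls_loss(logits, targets, is_negative)
```

Moving a small amount of probability mass off the true class keeps a supervision signal for the negative samples while letting their representations drift away from the hard-labeled positives, which is one way the separation in representation space described above can arise.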
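Below is a minimal sketch of the second strategy: a jigsaw transformation used to build a hard negative, which is then pushed apart from the anchor in an InfoNCE-style contrastive loss. The helper names `jigsaw` and `info_nce_with_hard_negative`, the 2x2 grid, and the temperature are illustrative assumptions; the saliency-guided variant described above would restrict the shuffling to regions selected by a saliency map rather than the whole image.

```python
import torch
import torch.nn.functional as F

def jigsaw(images, grid=2):
    """Split each image into a grid of patches and shuffle the patches."""
    b, c, h, w = images.shape
    ph, pw = h // grid, w // grid
    patches = images.unfold(2, ph, ph).unfold(3, pw, pw)  # B,C,grid,grid,ph,pw
    patches = patches.reshape(b, c, grid * grid, ph, pw)
    perm = torch.randperm(grid * grid)
    patches = patches[:, :, perm]                          # shuffle patch order
    patches = patches.reshape(b, c, grid, grid, ph, pw)
    # Reassemble the shuffled patches into full images.
    return patches.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)

def info_nce_with_hard_negative(z_a, z_b, z_neg, tau=0.2):
    """InfoNCE in which embeddings of jigsaw views act as extra negatives."""
    z_a, z_b, z_neg = (F.normalize(z, dim=1) for z in (z_a, z_b, z_neg))
    pos = (z_a * z_b).sum(dim=1, keepdim=True) / tau   # B x 1 positive logits
    neg = z_a @ z_neg.t() / tau                        # B x B negative logits
    logits = torch.cat([pos, neg], dim=1)
    # The positive pair sits at index 0 of every row.
    labels = torch.zeros(z_a.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)

# Toy usage: z_a and z_b would come from two standard augmented views of a
# batch, z_neg from the jigsaw view, all passed through the same encoder.
x = torch.randn(4, 3, 32, 32)
x_neg = jigsaw(x)
```

Because a jigsaw view destroys global structure while preserving local statistics, treating its embedding as a negative supplies the kind of extra prior the abstract describes for global feature learning.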
Keywords/Search Tags: data augmentation, negative samples, label smoothing, contrastive learning