
Research on Few-shot Learning Based on Data Augmentation

Posted on: 2022-11-29   Degree: Master   Type: Thesis
Country: China   Candidate: Y Cao   Full Text: PDF
GTID: 2518306764976839   Subject: Computer Software and Application of Computer
Abstract/Summary:
With the development of informatization and intelligent technology, the use, analysis, and mining of massive data have become a widely discussed topic. However, as data sources broaden and data volumes grow, correctly labeling the data costs a great deal of labor and time. Moreover, some fields, such as medicine and anomaly detection, are constrained by the limited amount of data available, which makes it difficult to train a model on massive data. Finding models that can learn quickly from a small number of samples has therefore become a pressing need, and few-shot learning is receiving considerable attention. Current solutions to the few-shot problem fall mainly into two groups: metric-based few-shot learning, in which the model learns more discriminative representations from a small number of samples, and few-shot learning based on transfer learning, in which knowledge from previously learned tasks is transferred to improve learning on future tasks. Both approaches revolve around the model or the algorithmic framework itself and do not consider the data. If the data set can be expanded to a sufficient size, the few-shot problem can be transformed into an ordinary classification or regression problem, and the problems caused by small samples can be effectively alleviated. This paper focuses on few-shot learning based on data augmentation, that is, using a small amount of existing labeled data or other unlabeled data to generate more equivalent data, thereby expanding the sample set and avoiding problems such as low robustness and overfitting. The small-sample data set is augmented and expanded along the following two lines.

First, collecting and labeling massive data is time-consuming and labor-intensive, whereas unlabeled data can in most scenarios be obtained relatively easily from a wide range of sources. This paper therefore proposes a few-shot learning algorithm based on weak labels. The algorithm uses the small amount of labeled data to assign weak labels to the unlabeled data, obtains a confidence for each assigned label from a deep neural network model, selects the samples with higher confidence to expand the data set, and then uses the expanded data to train the model iteratively, mitigating the drawbacks caused by limited samples. Experiments show that the algorithm improves the accuracy of the baseline algorithm by 5% on a public data set.

Second, current data augmentation methods focus mainly on transformations in the sample space and pay less attention to the latent feature space, even though the sample space contains redundant information. This paper therefore proposes a data augmentation algorithm based on the latent representation space. The algorithm divides the process into two stages: representation and decision. In the representation stage, a paired-input mechanism is used to train the deep network so that it maintains good performance with small samples; data augmentation is then performed in the representation space to expand the small-sample data; finally, the model is used in the decision stage to perform classification. To verify the effectiveness of the algorithm, training and testing are performed on public data and industrial-grade data. The experimental results show that the new algorithm achieves the highest accuracy among the algorithms compared.
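
The weak-label procedure described in the abstract (label the unlabeled data, keep only high-confidence samples, expand the data set, retrain, repeat) can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract uses a deep neural network, whereas the sketch substitutes a nearest-centroid classifier with a softmax over negative squared distances so that it runs with the standard library alone. The function names, the `threshold` and `rounds` parameters, and the toy data are all assumptions made for illustration.

```python
import math


def fit_centroids(X, y):
    """Stand-in for model training: mean feature vector per class.
    (Assumption: the thesis trains a deep neural network instead.)"""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict_with_confidence(model, x):
    """Return (weak_label, confidence), where confidence is the softmax
    probability of the closest centroid."""
    labels = sorted(model)
    logits = [-sum((a - b) ** 2 for a, b in zip(x, model[c])) for c in labels]
    peak = max(logits)
    exps = [math.exp(logit - peak) for logit in logits]  # numerically stable
    best = max(range(len(labels)), key=lambda i: exps[i])
    return labels[best], exps[best] / sum(exps)


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=5):
    """Iteratively expand the labeled set with confident weak labels."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = fit_centroids(X_train, y_train)
    for _ in range(rounds):
        remaining, added = [], 0
        for x in pool:
            label, confidence = predict_with_confidence(model, x)
            if confidence >= threshold:
                X_train.append(x)       # accept the weak label
                y_train.append(label)
                added += 1
            else:
                remaining.append(x)     # keep for a later round
        if added == 0:
            break                       # nothing confident enough; stop
        pool = remaining
        model = fit_centroids(X_train, y_train)  # retrain on expanded set
    return model, X_train, y_train


# Toy usage: two labeled points, three unlabeled points (hypothetical data).
labeled_X = [[0.0, 0.0], [4.0, 4.0]]
labeled_y = [0, 1]
unlabeled_X = [[0.2, -0.1], [3.9, 4.2], [2.0, 2.0]]
model, X_train, y_train = self_train(labeled_X, labeled_y, unlabeled_X)
print(len(X_train), y_train)
```

In this toy run the ambiguous point midway between the two classes never reaches the confidence threshold, so it stays out of the training set. The threshold trades off how much data is added against how many wrong weak labels are admitted; the abstract does not state which value or selection rule the thesis uses.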
Keywords/Search Tags: Few-shot Learning, Data Augmentation, Deep Neural Network, Weak Labeling, Data Mining