
Derivative-Free Optimization Method of Convolution-Based Models for Few-Shot Learning

Posted on: 2022-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Li
Full Text: PDF
GTID: 2518306323978269
Subject: Computer software and theory
Abstract/Summary:
The Convolutional Neural Network (CNN), a feedforward neural network, has proven highly successful in data-intensive applications and achieves excellent performance in image processing. However, it may be hampered when the data set is small, mostly because both gradient descent and the convolutional structure require a sufficiently large amount of data. Since the empirical risk minimizer is no longer reliable in few-shot learning scenarios, recent CNN-based deep learning methods built on gradient descent, network restructuring, or data augmentation have achieved only limited performance improvements. In addition, the convolutional and pooling structures introduce some information loss.

To tackle this problem in the object-classification setting of image processing, this thesis takes AlexNet as an example and proposes a derivative-free optimization method for few-shot learning based on CNN pre-trained models. First, each sample is generalized into a series of samples through causal intervention and data augmentation, converting non-sequential data into sequence data and at the same time enlarging the sample data set. Based on optimal transport theory and a co-integration test, the feature-extraction ability of the sampling points in the model is evaluated from the perspective of the stability of the data distribution. This evaluation then guides directional network pruning of the pre-trained model, removing part of the noise information extracted by the pre-trained model and making the model a better descriptor of the data distribution. Next, based on the Capital Asset Pricing Model and optimal transport theory, forward learning without gradient propagation is performed on the intermediate outputs of the pre-trained model: a new structure is constructed and optimized, and sampling points are selectively recombined according to the collaborative relationships between them, generating representation vectors with clear inter-class separation in the distribution space and thereby removing the dependence on the gradient-descent algorithm. Finally, the intermediate output of the network adaptively generates effective features and forms the embedding representation vector through a self-attention module.

Experimental results show that the proposed method effectively improves the performance of CNN-based pre-trained models such as AlexNet and ResNet in few-shot learning scenarios: on the animal subset (about 100 categories) of ImageNet 2012, accuracy rises from 58.85% and 78.51% to 68.50% and 86.75%, respectively.
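The distribution-stability pruning step described above can be pictured with a minimal sketch: each channel of an intermediate feature map is scored by the 1-D Wasserstein distance between its activation distributions under two augmented views of the same support samples, and the most stable channels are kept. This is an illustrative approximation only, not the thesis's exact procedure; the function names (channel_stability_scores, prune_mask), the keep_ratio parameter, and the stability criterion are assumptions, and the co-integration test the thesis also uses is omitted here.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def channel_stability_scores(feats_a, feats_b):
    """Per-channel 1-D Wasserstein distance between activation distributions.

    feats_a, feats_b: float arrays of shape (n_samples, n_channels, h, w),
    e.g. intermediate feature maps of a pre-trained CNN for two augmented
    views of the same few-shot support samples. A smaller distance means
    the channel's output distribution is more stable under intervention.
    """
    n_channels = feats_a.shape[1]
    scores = np.empty(n_channels)
    for ch in range(n_channels):
        # Flatten all spatial activations of this channel across samples
        a = feats_a[:, ch].ravel()
        b = feats_b[:, ch].ravel()
        scores[ch] = wasserstein_distance(a, b)
    return scores

def prune_mask(scores, keep_ratio=0.8):
    """Boolean mask keeping the most distribution-stable channels."""
    k = max(1, int(keep_ratio * len(scores)))
    keep = np.argsort(scores)[:k]   # smallest distances = most stable
    mask = np.zeros(len(scores), dtype=bool)
    mask[keep] = True
    return mask

# Toy usage with synthetic feature maps standing in for AlexNet activations
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(20, 64, 6, 6))
feats_b = feats_a + rng.normal(scale=0.1, size=feats_a.shape)
mask = prune_mask(channel_stability_scores(feats_a, feats_b))
print(f"keeping {mask.sum()} of {mask.size} channels")
```

Channels flagged False would be removed from the pre-trained model before the subsequent gradient-free forward-learning stage; in the thesis this selection is additionally informed by the co-integration test rather than the Wasserstein distance alone.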
Keywords/Search Tags: Capital Asset Pricing Model (CAPM), Wasserstein distance, derivative-free learning, self-attention, pre-trained model