In recent years, deep learning has made unprecedented breakthroughs in many fields, such as computer vision, natural language processing, and data mining. This is largely due to advances in three key factors: computing power, algorithms, and massive amounts of data. The first two are almost universal across application scenarios, but data limitations greatly restrict the range of usable scenarios. Not only is data annotation extremely time-consuming and labor-intensive, but in many scenarios it is difficult to collect enough samples at all (e.g., rare disease data, accidental industrial defects). Therefore, the field of "few-shot learning" has received increasing attention and exploration in recent years. Current few-shot learning methods alleviate the above problems to some extent, but they still face important challenges in the following aspects. First, how to fully explore and exploit the correlations among samples when samples are scarce. Second, how to stabilize the training process and maintain semantic stability under the meta-learning training framework. Third, how to mitigate the performance degradation caused by the distribution difference between test data and training data. This thesis takes few-shot learning as the scientific problem, takes image feature extraction and analysis as the core technology, and focuses on two typical tasks: few-shot classification and cross-domain few-shot classification. Aiming at the above difficulties, we introduce graph neural networks, word embeddings, and other techniques, improve meta-learning training strategies, design a task-level self-supervised learning framework, and finally form a unified cross-domain few-shot learning paradigm. The main innovative contributions are as follows:

(1) We address the few-shot classification problem by mining the correlations among samples. To this end, we propose the Prototypical Graph Network (PGN) to fully mine and
exploit the correlations among all samples. It consists of two parts. The first is a fully connected graph neural network; unlike previous methods, it not only updates node and edge features but also alternately refines node labels, and the collaborative use of feature updating and label propagation explores both kinds of correlation to a greater extent. The second part computes the prototype of each class to mine the common features of that class; a geometric constraint term based on these prototypes is further introduced into the training loss to improve the robustness of PGN training.

(2) To address the semantic instability of the meta-learning training framework, which leads to ambiguity in the embedding space, we propose to establish consistency between "episodes" during training. We propose a novel cross-modality graph neural network (CMGNN) that explores correlations between tasks to achieve a consistent global embedding. Since the semantic information produced by NLP models is relatively fixed with respect to the visual feature space, we exploit it to construct a meta-node for each category through an attention mechanism. Meanwhile, to promote information propagation from the meta-nodes and guide the GNN to perform the corresponding visual feature learning, we synthesize virtual nodes via manifold mixup as transitions between real sample nodes. Furthermore, to ensure a global embedding, we design a distance loss that forces visual nodes to cluster more tightly around their associated meta-nodes.

(3) Most existing methods perform poorly on cross-domain few-shot tasks, i.e., when there is a huge domain shift between seen and unseen classes. We attribute this to the fact that current meta-learning strategies ignore domain adaptation of the model using the support-set samples. Therefore, we first propose a bi-level episodic strategy (BL-ES) to train the "Inductive Graph Network" (IGN), which enables the model to have
both the "inductive" ability obtained by the traditional training strategy and the "contrastive" ability obtained by the meta-learning strategy. First, the outer loop of BL-ES continuously simulates cross-domain few-shot tasks, while the inner loop trains the IGN to extract the common features of the test categories. Then, the IGN mines the correlations among all samples to update the meta-point of each category in the induction module. Finally, we introduce a geometric constraint term into the training loss to update the nodes and edges in the feature space using the meta-points.

(4) After addressing the cross-domain few-shot problem from the perspective of domain adaptation, we also attempt to solve it from the perspective of domain generalization. We propose a task-level self-supervised (TL-SS) learning framework to improve the generalization ability of few-shot learning methods. The TL-SS strategy generalizes the idea of label-based instance-level supervision to task-level self-supervision by constructing multiple views of a task. In addition, two regularization methods, task consistency and correlation measurement, are introduced to significantly stabilize the training process. We also propose a high-order associative encoder (HAE) that utilizes a 3D convolutional module to generate suitable parameters, allowing the encoder to flexibly handle unseen tasks. The two modules complement each other and experimentally show a large improvement over state-of-the-art methods. Finally, we design a more general task-agnostic test to discuss the generalization ability of few-shot learning in depth.
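To make the recurring prototype idea concrete, the following is a minimal NumPy sketch of the generic prototype-based mechanism that contributions (1) and (3) build on: class prototypes are computed as mean embeddings of the support samples, queries are classified by their nearest prototype, and a prototype-distance term plays the role of a geometric constraint in the loss. All function names, shapes, and the toy episode are illustrative assumptions for exposition, not the thesis implementation of PGN or IGN.

```python
import numpy as np

def prototypes(support, labels, n_classes):
    # Mean embedding per class: the "prototype point" of that class.
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def classify(query, protos):
    # Nearest-prototype classification by squared Euclidean distance.
    d = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def geometric_constraint(embeddings, labels, protos):
    # Illustrative constraint term: mean squared distance of each sample
    # to its own class prototype (pulls classes into tight clusters).
    return ((embeddings - protos[labels]) ** 2).sum(-1).mean()

rng = np.random.default_rng(0)
# Toy 2-way 5-shot episode in a 4-d embedding space (synthetic features).
centers = np.array([[0., 0., 0., 0.], [3., 3., 3., 3.]])
labels = np.repeat([0, 1], 5)
support = centers[labels] + 0.1 * rng.standard_normal((10, 4))
query = centers[[0, 1]] + 0.1 * rng.standard_normal((2, 4))

p = prototypes(support, labels, 2)
pred = classify(query, p)
loss = geometric_constraint(support, labels, p)
print(pred)        # queries fall to their own class prototypes
print(loss < 1.0)  # the constraint term is small when clusters are tight
```

In the thesis, the embeddings would come from a learned encoder (a GNN), and the constraint term would be added to the training loss rather than merely evaluated; this sketch only shows the geometry of the prototype mechanism.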