Research On Few Shot Image Recognition Based On Semantic Prior

Posted on: 2022-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: P B Xu
GTID: 2518306563475644
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, deep learning has achieved great success in image recognition, speech recognition, and natural language processing. These achievements often rely on large-scale training data, and building large-scale data sets is not only expensive but also infeasible in certain specialized domains such as medicine. Few-shot learning, which trains a model from a small number of samples, has therefore attracted growing attention and has become a recent research hotspot in artificial intelligence. Few-shot image recognition aims to transfer useful knowledge from base classes, which have many training images, to help the model identify new target categories for which labeled images are extremely scarce. The current mainstream methods perform well on the training categories but poorly on new categories with few samples, and they lack sufficient generalization ability. Few-shot image recognition models also inherit the lack of interpretability common to deep learning models. To address these two problems, this thesis proposes the following two models and verifies their effect in a practical application scenario:

(1) A few-shot image recognition method based on category semantic similarity. A word embedding model is used to learn a word embedding vector for each category in the image data set, and the cosine distance between category word embedding vectors is used to represent the semantic similarity of the categories. This semantic relevance between categories serves as semantic prior knowledge: it is introduced in the model training stage to connect the base classes with the new classes, so that the learned features generalize better. Extensive experiments are carried out on the few-shot image recognition benchmark data sets miniImageNet and tieredImageNet. The results show that the proposed method achieves higher recognition accuracy and stronger generalization of the feature representation than the baseline model.

(2) A few-shot image recognition method based on the fusion of attribute-category relationships. The attribute labels attached to the sample data are used as a semantic prior to establish the relationship between the base classes and the new classes. A multi-branch network architecture allows the model to learn the category labels and the attribute labels of an image at the same time. The model weights the attribute predictions produced by the attribute branch network, combines them with the global features of the backbone network to obtain a feature representation that fuses the attribute-category relationship, and finally uses the fused representation to classify the image samples. Extensive experiments are carried out on the few-shot image recognition benchmark data sets CUB-200-2011 and AWA. The results show that the proposed method increases the generalization ability of the model while also improving its interpretability.

(3) Application to the identification of genetic syndromes from facial appearance. This thesis constructs a few-shot image data set collected in a real environment and verifies the effectiveness of the proposed methods in the practical scenario of recognizing genetic syndromes from facial features. It also constructs a knowledge graph of the facial abnormalities associated with facial genetic syndromes, which makes the model output more intuitive in practical use by visualizing the semantic prior.
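The category semantic similarity used in method (1) can be illustrated with a minimal sketch. It assumes each category name already has a word embedding vector; the three-dimensional vectors and the helper names (`cosine_similarity`, `class_similarity_matrix`, `most_similar_base_class`) are hypothetical and are not taken from the thesis, which does not specify the embedding model or how the similarity enters the training loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (cosine distance is 1 minus this)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def class_similarity_matrix(embeddings):
    """Return sorted class names and their pairwise cosine similarity matrix."""
    names = sorted(embeddings)
    matrix = [
        [cosine_similarity(embeddings[a], embeddings[b]) for b in names]
        for a in names
    ]
    return names, matrix


def most_similar_base_class(novel, base_names, embeddings):
    """Pick the base class whose embedding is closest to the novel class."""
    return max(
        base_names,
        key=lambda base: cosine_similarity(embeddings[novel], embeddings[base]),
    )


# Toy embeddings, made up for illustration only.
toy_embeddings = {
    "wolf": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "bus": [0.0, 0.1, 0.9],
}

names, sim = class_similarity_matrix(toy_embeddings)
closest = most_similar_base_class("wolf", ["dog", "bus"], toy_embeddings)
```

With these toy vectors, the novel class "wolf" is matched to the base class "dog" rather than "bus". In a setup like the one described above, such a similarity matrix would act as the semantic prior linking base and new classes during training.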
Keywords/Search Tags: Few-shot learning, Image recognition, Meta learning, Semantic prior, Generalization ability, Interpretability