
Generalized Zero-shot Learning Based On Contrastive Learning And Semantic Augmentation

Posted on: 2024-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Tang
Full Text: PDF
GTID: 2568307115497574
Subject: Electronic Information (Computer Technology) (Professional Degree)
Abstract/Summary:
In computer vision, supervised learning models achieve strong performance when a large amount of annotated data is available. Collecting labeled data, however, is time-consuming and labor-intensive: every image category requires manual annotation, which significantly increases the cost of deep learning. As a result, zero-shot learning has become a research hotspot in recent years and has been widely applied to real-world tasks. Because no samples of the unseen categories are available during training, generalized zero-shot learning relies on additional semantic information to transfer knowledge from seen to unseen classes. Two problems arise. First, semantic information and visual images belong to different modalities, so without effective constraints their manifold distributions can be inconsistent once both are mapped into the same latent space. Second, although manually labeled attributes build a bridge between seen and unseen classes, they contain a large amount of redundant information as well as labeling errors, which can hinder the transfer of critical information between categories and reduce model performance. To address these two problems, this thesis adds constraints on the latent-space embedding and constructs an additional semantic graph, and proposes the following two contributions:

(1) A generalized zero-shot pre-classification model based on embedding contrastive learning. The model uses a hyperspherical autoencoder to map visual features into the latent space and tightens the manifold boundary of each seen class through contrastive learning. Using the distance between each seen-class manifold boundary and its center, test samples are divided into seen-class samples and unseen-class samples. Two expert classifiers then classify the seen-class and unseen-class samples separately.

(2) A generalized zero-shot learning method based on a semantic category graph. The method introduces a semantic category graph to establish relationships between categories, making better use of the differences and similarities in attribute space and of additional semantic embeddings to represent categories. A graph neural network then propagates the semantics of each node to its neighbors and outputs a semantic vector for each category. Finally, a nearest-neighbor classifier makes predictions in the learnable classification space, based on the nearest-neighbor relationship between the features of the image to be predicted and the semantic vectors of all categories.

The proposed algorithms perform well on multiple benchmark datasets, and extensive experiments support the effectiveness of the models.
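The two decision rules described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the trained encoder and graph neural network are not shown, and every function name below is hypothetical. The sketch assumes that latent embeddings are L2-normalized onto a unit hypersphere, that a seen (visible) class boundary is taken as the largest cosine distance from a training embedding to its class center, and that graph propagation is a single round of unweighted mean aggregation rather than a learned network.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def fit_seen_class_boundaries(embeddings_by_class):
    """Assumed rule: center = normalized mean of the class embeddings,
    radius = largest distance from a training embedding to that center."""
    boundaries = {}
    for label, embeddings in embeddings_by_class.items():
        unit = [l2_normalize(e) for e in embeddings]
        dim = len(unit[0])
        mean = [sum(e[i] for e in unit) / len(unit) for i in range(dim)]
        center = l2_normalize(mean)
        radius = max(cosine_distance(e, center) for e in unit)
        boundaries[label] = (center, radius)
    return boundaries


def route_sample(embedding, boundaries):
    """Send a test embedding to the 'seen' or 'unseen' expert classifier."""
    z = l2_normalize(embedding)
    for center, radius in boundaries.values():
        if cosine_distance(z, center) <= radius:
            return "seen"
    return "unseen"


def propagate_semantics(class_vectors, adjacency):
    """One round of mean aggregation over each node and its neighbors
    (a stand-in for a learned graph neural network layer)."""
    n = len(class_vectors)
    dim = len(class_vectors[0])
    propagated = []
    for i in range(n):
        group = [j for j in range(n) if j == i or adjacency[i][j]]
        propagated.append(
            [sum(class_vectors[j][k] for j in group) / len(group) for k in range(dim)]
        )
    return propagated


def nearest_class(image_feature, class_vectors):
    """Index of the class whose semantic vector is closest to the image feature."""
    z = l2_normalize(image_feature)
    distances = [cosine_distance(z, l2_normalize(c)) for c in class_vectors]
    return min(range(len(distances)), key=distances.__getitem__)
```

In this sketch, `route_sample` plays the role of the pre-classification gate in contribution (1), while `propagate_semantics` followed by `nearest_class` mirrors the graph propagation and nearest-neighbor prediction in contribution (2). The max-distance radius is only one possible boundary rule; a percentile of training distances would be a more outlier-tolerant alternative.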
Keywords/Search Tags:Generalized Zero-Shot Learning, Image Classification, Semantic Augmentation, Contrastive Learning, Graph Neural Networks