Font Size: a A A

A Visual-Semantic Embedding Method For Zero-shot Object Recognition

Posted on:2019-04-22Degree:MasterType:Thesis
Country:ChinaCandidate:X B HanFull Text:PDF
GTID:2518306473454004Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Object recognition is a very important research topic in computer vision.The task of Zero-shot object recognition is to identify categories that don't appear in the training set during the testing stage.There are several disadvantages of traditional object recognition algorithms.First,the labeled dataset in the training stage is scarce.Second,the categories that can be recognized are far less than all the categories in nature.Third,adding a new category requires retraining the model.Therefore,Zero-shot object recognition has received more and more attention in recent years.This paper briefly introduces the common methods of Zero-shot object recognition:attribute-based method,hierarchy-based method and visual-semantic embedding,and deeply analyzes the existing problems in the most popular method,visual-semantic embedding.In order to improve these deficiencies,Hierarchical network embedding and Dynamic hybrid model are proposed.The main contributions of this paper is showed as below:(1)Aiming at the shortcomings of word2vec representing taxonomical similarity,as well as semantic shifting,ambiguities of polysemy and noun phrase,this paper learns from the idea of transfer learning,introduces the concept of hierarchical structure in taxonomy,Entity representation based on Hierarchical network embedding are proposed,which uses adjacent category nodes in the taxonomic hierarchy to represent an object category.It effec-tively enhances the representation ability of the entity.We also design experiment to verify the improvement of Hierarchical network embedding.(2)Aiming at the problems caused by the fixed size of candidate set in the hybrid model,the existence of exponential distribution in the image is proved by experiments.This paper proposes Dynamic embedding model which chooses the candidate sets according to differ-ent probability distributions of unseen images.The middle vectors mapped from unseen images to the semantic space can be calculated effectively.The hierarchical-dynamic visual-semantic embedding model is further proposed.In this paper,we experiment on a large-scale dataset ImageNet,compare with other competitors,and prove that our model achieves state-of-the-art performance.Finally,this paper designs and implements the prototype system for Zero-shot object recognition.
Keywords/Search Tags:Zero-shot learning, object recognition, visual-semantic vmbedding, deep learning, convolution neural network, word embedding, entity representation
PDF Full Text Request
Related items