Multimodal-based Embedding For Fine-grained Image Classification

Posted on: 2019-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: H P Xu
Full Text: PDF
GTID: 2428330596460897
Subject: Software engineering
Abstract/Summary:
This thesis investigates a challenging problem known as fine-grained image classification (FGIC). Unlike conventional computer vision problems, FGIC suffers from large intra-class diversity and subtle inter-class differences. Existing FGIC approaches explore only the visual information embedded in the images. In this thesis, we show that leveraging prior external knowledge can significantly benefit FGIC. We present a novel approach that uses readily available prior knowledge from either structured knowledge bases or unstructured text to facilitate FGIC. Specifically, we propose a visual-semantic embedding model that explores semantic embeddings from knowledge bases and text, and we further train a novel end-to-end CNN framework to linearly map image features to a rich semantic embedding space. The framework is a two-level CNN. The first level is a detection network that captures the local features of the object in an image. The second level is a classification network that captures the global features and linearly maps image features to the semantic embedding space. The main contributions of this thesis are as follows: (i) To the best of our knowledge, this is the first work that combines text and knowledge bases in one framework for fine-grained image classification; the external prior knowledge is incorporated in a carefully designed way to linearly map the visual space to multiple semantic spaces via a visual-embedding approach. (ii) This thesis also proposes a multitask learning framework that integrates the detection and classification mechanisms by using the Hadamard product. (iii) Experimental results on the challenging, large-scale Caltech-UCSD Birds-200-2011 dataset verify that our approach outperforms several state-of-the-art methods by a significant margin.
Keywords/Search Tags:Fine-grained Image Classification, Convolutional Neural Network, Embedding, Knowledge Base
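
The two-level idea in the abstract (local features from a detection network, global features from a classification network, a Hadamard product, and a linear map into a semantic embedding space) can be illustrated with a toy sketch. This is not the thesis implementation: the abstract does not say exactly where the Hadamard product is applied or how classes are scored, so fusing the two feature vectors element-wise and ranking classes by dot product with their embeddings are assumptions made here. All function names, dimensions, weights, and class labels are hypothetical, and only a single semantic space is shown.

```python
# Illustrative sketch only: names, dimensions, and values are hypothetical,
# and the fusion/scoring choices are assumptions, not the thesis's method.

def hadamard(u, v):
    """Element-wise (Hadamard) product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a * b for a, b in zip(u, v)]

def linear_map(W, x):
    """Apply the linear map W (given as a list of rows) to the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(u, v):
    """Dot product, used here as an assumed similarity score."""
    return sum(a * b for a, b in zip(u, v))

def classify(local_feat, global_feat, W, class_embeddings):
    """Fuse features, project them into the semantic space, pick the best class."""
    fused = hadamard(local_feat, global_feat)   # assumed fusion of the two levels
    projected = linear_map(W, fused)            # visual space -> semantic space
    scores = {name: dot(projected, emb) for name, emb in class_embeddings.items()}
    return max(scores, key=scores.get), scores

# Toy inputs standing in for CNN outputs and learned parameters.
local_feat = [1.0, 0.5, 0.0]      # stand-in for detection-network features
global_feat = [2.0, 2.0, 3.0]     # stand-in for classification-network features
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]             # hypothetical 3-d visual to 2-d semantic map
class_embeddings = {"sparrow": [1.0, 0.0], "warbler": [0.0, 1.0]}

label, scores = classify(local_feat, global_feat, W, class_embeddings)
print(label, scores)
```

In a real system the feature vectors would come from trained CNNs, the map W would be learned end to end, and the class embeddings would be derived from a knowledge base or text corpus rather than written by hand.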