Font Size: a A A

The Research Of Fine-grained Image Categorization Based On Traditional Methods And Deep Learning

Posted on:2015-02-13Degree:MasterType:Thesis
Country:ChinaCandidate:H H ZhangFull Text:PDF
GTID:2308330473956994Subject:Electronic and communication engineering
Abstract/Summary:PDF Full Text Request
Classification of objects in large-scale image datasets has been a hot topic for many years. It is a basic way towards image understanding and implies a wide range of applications. Today, one of the most popular methods of image classification is to represent images with long vectors, and use a standard classifier for training and testing.Traditional Bag-of-Features (BoF) framework is widely used for image representation. It is a statistics-based model which summarizes local features in a sparse vector. Despite the great success of the model, it still suffers from the well-known semantic gap between low-level features and high-level concepts, as well as the poor object alignment on images. Recent years, researchers proposed new approaches to deal with the above problems. Successful examples include extracting different kinds of descriptors, building mid-level representation, spatial weighting and so on. Systems with these new modules produce state-of-the-art classification performances, but the connection between image representation and image semantics is still weak.Fortunately, as evidences accumulate in Neuroscience, researchers realize that human beings recognize objects using a combinational representation of local features. It suggests a structural model for learning high-level concepts. However, traditional image collections usually contain a large number of irrelevant concepts, which limits the Computer Vision algorithms from learning structural models with few training examples. Therefore, a promising direction is to consider Fine-Grained Visual Categorization (FGVC), in which we are dealing with image categories sharing similar semantics. However, the BoF framework gives poor performances on FGVC tasks.To focus on object description, we study a special topic of image classification, i.e., Fine-Grained Visual Categorization (FGVC). It provides good opportunities for a deep learning of high-level concepts. However, traditional BoF model gives poor performances on FGVC tasks, due to the lack of using fine-grained properties. In this paper, we propose a novel model named Hierarchical Part Matching (HPM) by developing three new modules:(1) foreground inference and segmentation; (2) Hierarchical Structure Learning; and (3) Geometric Phrase Pooling. Using ground truth annotations, our approach achieves better image representation, and overwhelmingly outperforms the state-of-the-art algorithms on a challenging FGVC dataset.We also use the convolutional neural networks which is an important structure of deep learning method to improve the Fine-Grained Visual Categorization problem.
Keywords/Search Tags:Fine-Grained Visual Categorization, Bag-of-Features, deep learning, convolutional neural networks
PDF Full Text Request
Related items