Font Size: a A A

Research On Image Categorization Based On Bag-of-words Model

Posted on:2014-02-25Degree:DoctorType:Dissertation
Country:ChinaCandidate:L N WuFull Text:PDF
GTID:1268330425489187Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
With the rapid development of the internet, a large number of digital images arise in our lives, and their number and categories have a massive increase. Image categorization has gained more and more attention as it can help people organize and manage images effectively. Bag-of-visual words(BOV) model which is based on local features for image categorization has been shown to yield state-of-the-art results.An important research on BOV model is how to create and improve vocabulary to represent images effectively and improve performance of BOV. Another important research is the transfer learning of BOV, which can avoid BOV model learning from the beginning for each category. The transfer learning can retain good performance in the learning task when there are only a few images.This paper analyzes each step (feature extraction, feature description, vector quantitation, classifier learning) of BOV model, and improves the vocabulary to fit transfer learning.This paper studies on the methods of optimizing and improving visual vocabulary, which aims at creating visual vocabulary for transfer learning. This paper proposes that creating visual phrases through the composition of several visual words by utilizing spatial information. The visual phrase can find and represent common local spatial information among different image categories, and avoid semantic ambiguity, which can be transferred to visual vocabulary of a novel image category. There are two parts of research in this paper:the first part is that how to obtain discriminative vocabulary and a set of phrases with spatial information, which can provide necessary knowledge (appearance information and spatial information); the second part is that how to make use of learned knowledge to speed the learning of a new category and improve its performance, especially when there are only a few training images. The main creative work and research of the paper is summarized as follows:1. A weighted minimal-redundancy-maximal-relevance criterion (WMR-MR) is defined. The criterion of WMR-MR considers both the redundancy between one word and another and the relevance of between a word and the category. The algorithm improves a vocabulary by eliminating redundant words which have less relevance with its category. Discriminative vocabulary with a relative small vocabulary is obtained which can solve the problem that large vocabulary can result in complicated computing and redundant words. The vocabulary obtained by this algorithm can provide a basis for creating visual phrases and the transfer learning of phrases.2. An algorithm of creating visual phrases with local spatial information is proposed. The position information can be obtained when extracting local features. According to this, stable neighbor relation between visual words can be modeled by visual phrases. Compared with global spatial information, the local spatial information of the visual phrases can deal with intra-class variation, which has strong robustness. Moreover, the visual phrases are helpful for eliminating ambiguity when a visual word is used for image categorization individually. So visual phrases are more reliable than words. Visual words which represent appearance information of local features and visual phrases which represent local spatial information are integrated to form two sources of information for categorization. The algorithm can balance these two sources by adjusting the weight for various image categories, so it can be applied in different image categorization.3. A transfer learning algorithm based on visual phrases is proposed. The algorithm describe the common features of various image categories by visual phrases, which is aimed at making use of learned knowledge to help the learning of a novel image categorization. During the learning of a novel category, the algorithm adjusts visual phrases to transfer by the way of iteration to retain visual phrases which are helpful for image categorization. The retained visual phrases can make the classifier of the novel categorization have a good performance after transfer learning. Compared with relearning vocabulary, this transfer learning algorithm makes use of learned knowledge effectively, which can gain good performance especially in the situation when there are a few images in a novel category.
Keywords/Search Tags:Imagc categorization, The bag-of-visual words model, Visual word, Visualphrase, Transfer learning
PDF Full Text Request
Related items