Local Feature Modeling For Visual Object Categorization

Posted on:2013-12-22

Degree:Doctor

Type:Dissertation

Country:China

Candidate:H X Wang

Full Text:PDF

GTID:1228330395967380

Subject:Computer application technology

Abstract/Summary:

Visual object categorization is a challenging key technology for applications such as image retrieval, massive video data analysis, visual perception substitution, autonomous robots, and interactive games. It has wide application demand and significant value for both practice and research. Its goals are to detect objects in images and to determine their categories. This dissertation studies local feature modeling for visual object categorization with the bag-of-visual-words method, covering local feature linear coding, pooling, and the spatial matching scheme, and proposes several improved methods. The main work and contributions are as follows:
(1) Locality-constrained principal component linear coding (LPLC) is proposed. Linear correlation analysis between a local feature and its K-nearest-neighbor visual words, together with significance testing of locality-constrained linear coding, shows that the fundamental cause of nonsignificant weight coefficients is multicollinearity among the K-nearest-neighbor visual words. LPLC is proposed to address this. Experiments comparing and evaluating the method on the Caltech-4 dataset show that it improves classification accuracy.
(2) Locality-constrained linear coding based on the principal components of the visual vocabulary is proposed. LPLC resolves the multicollinearity but increases coding time. To reduce this overhead, determining the principal components of each local feature's K-nearest-neighbor visual words is simplified to determining the principal components of the visual vocabulary only. The number of principal components of the visual vocabulary is determined according to the cumulative contribution ratio.
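As a rough illustration of choosing the number of principal components by cumulative contribution ratio, the sketch below picks the smallest number of components whose eigenvalues account for a target share of the total variance. The function name `num_components` and the example eigenvalues are hypothetical and are not taken from the dissertation, which does not describe its implementation here.

```python
def num_components(eigenvalues, target_ratio=0.85):
    """Return the smallest k whose top-k eigenvalues reach target_ratio
    of the total variance (the cumulative contribution ratio)."""
    if not 0.0 < target_ratio <= 1.0:
        raise ValueError("target_ratio must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target_ratio:
            return k
    # Fallback for floating-point rounding when target_ratio is 1.0.
    return len(ordered)


# Hypothetical eigenvalues of a vocabulary covariance matrix (illustrative only).
example_eigenvalues = [5.0, 3.0, 1.0, 0.5, 0.5]
k = num_components(example_eigenvalues, target_ratio=0.85)
```

With these illustrative values the first two components cover 80% of the variance and the first three cover 90%, so three components are kept for an 85% target.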
Experimental results show that the proposed method reduces linear coding time by 1/3. Weighing coding time against classification results, the method performs best when the cumulative contribution ratio is 85% and the number of principal components of the visual vocabulary is 20. Under each pooling method, both proposed linear coding methods improve classification accuracy, and their accuracies are similar, indicating that locality-constrained linear coding based on the principal components of the visual vocabulary reduces the time overhead while retaining the advantages of LPLC.
(3) The spatial pyramid matching (SPM) approach, which is based on approximate global geometric correspondence, is not invariant to translation, scale, or rotation of visual objects in images. A novel spatial matching method based on a visual vocabulary shape description model is proposed. In this method, a spatial geometric model relative to the geometric center of each visual word is constructed to guarantee translation invariance. Log-polar spatial pyramid matching is presented, in which the log-polar radius and polar angle are subdivided proportionally and a consistent orientation is assigned to each visual word to achieve scale and rotation invariance. Experiments comparing and evaluating the method on the Caltech-4 dataset and our own dataset show that it improves classification accuracy, especially on the dataset containing images with obvious translation, scaling, and rotation changes, and that it is more robust, as indicated by its smaller variance.
(4) Building on the study of locality-constrained linear coding, the pooling of local feature codes is discussed and power normalization is introduced into local feature modeling.
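The power normalization mentioned in (4) can be sketched as follows. The abstract does not give the exact form or exponent used, so this assumes the common signed power form sign(z)|z|^alpha with alpha = 0.5, followed by L2 normalization. The function name `power_normalize` and the example vector are hypothetical.

```python
import math


def power_normalize(vector, alpha=0.5):
    """Apply sign(z) * |z|**alpha to each entry, then L2-normalize.

    Assumed form: the exponent and the L2 step are illustrative choices,
    not confirmed by the source.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    powered = [math.copysign(abs(z) ** alpha, z) for z in vector]
    norm = math.sqrt(sum(p * p for p in powered))
    if norm == 0.0:
        return powered  # all-zero vector: nothing to rescale
    return [p / norm for p in powered]


# Hypothetical pooled code vector dominated by one large entry.
pooled = [0.81, 0.01, 0.0, -0.04]
normalized = power_normalize(pooled)
```

In this sketch, entries that are exactly zero stay zero and signs are preserved, while small non-zero entries grow relative to the largest one, so the vector becomes less dominated by a few large values.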
Experiments comparing and evaluating the proposed methods on the Caltech-4 dataset show that the introduced power normalization reduces the sparsity of the local feature model vector, and reveal that this sparsity can significantly affect classification results.
(5) The proposed local feature modeling methods are integrated. Experiments comparing and evaluating the integrated method were conducted in Matlab on the Caltech-101 and Pascal VOC 2007 datasets. Results on the Caltech-101 dataset show that the average classification accuracy (ACA) of the integrated method is higher than that of the other five methods. Results on the Pascal VOC 2007 dataset show that the average precision (AP) of the integrated method is similar to that of the best system in the Pascal VOC 2007 challenge. Finally, a prototype visual object categorization system based on a client/server (C/S) architecture is implemented.

Keywords/Search Tags:

local feature linear coding, spatial matching scheme, pooling, local feature modeling, visual word, visual object categorization

Related items

1	Research On Object Detection Based On Bag Of Visual Words Model
2	Visual Object Representation Based On Salient Local Features
3	Feature Coding And Its Applications To Image Categorization
4	Research On Robust Visual Object Tracking Algorithm
5	Research On Scene Classification Technologies With The Local Context Feature And Spatial Pyramid Model
6	Image Processing And Object Recognition Based On The Cognitive Mechanism Of Visual Information
7	Research On Object Recognition Technology Based On Visual Attention And Local Invariant Feature
8	Research Of Sparse Coding For Fine-Grained Visual Categorization
9	Biologically Motivated Feature Extraction And Object Categorization
10	Technology Research, Based On Local Features For Image Object Recognition