
Object Detection Combining Position Information With Invariant Local Features

Posted on: 2009-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Wu
GTID: 2178360278957015
Subject: Control Science and Engineering
Abstract/Summary:
Over 80 percent of the information humans acquire from the external world is supplied by vision, and human vision outperforms machine vision systems by almost any measure, so emulating the processing of visual information in the cortex has long been an attractive topic. In the area of object recognition, T. Poggio et al. from MIT proposed a hierarchical max model (HMAX) of feed-forward information processing in the ventral visual pathway; through continuous improvement, this model has achieved good performance and invariance in the classification of natural images. Meanwhile, probabilistic graphical methods, which can model relations among parts, have found increasingly many applications in the computer vision literature. These models can effectively emulate how interrelated parts form an integral system with particular functions.
The HMAX model discards all global position information about local features, and relying only on the blurred appearance of object parts can hardly yield satisfactory recognition performance when object appearance is vague. We propose an algorithm that builds on the HMAX model and models the global spatial relations among object parts with Gaussian Markov Random Fields (GMRF). The algorithm proceeds in three steps.
First, we randomly sample local features (object parts) and generate hierarchical representations of images in a way similar to the "Standard Model" of visual cortex, which further develops the initial HMAX model. Second, a matching procedure picks a unique location for each part from among the local maxima in the S2 layers; the resulting part positions serve as the spatial configurations from which the spatial prior is learned in the next step. Finally, we model the spatial relations among parts as a sparse GMRF graph and learn the sparse links between pairs of parts with a lasso-based approach, using the part configurations computed in the previous step. Object localization in new images proceeds by maximizing the posterior of an object observed at a particular configuration. Experimental results on the Caltech-101 database demonstrate that the proposed algorithm locates the components more precisely and outperforms the "Standard Model" in object detection, and show that performing "feature binding" and additionally representing the spatial information of parts can further improve recognition performance.
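The third step can be illustrated with a small sketch. The abstract only says that the sparse links are learned with "a lasso-based approach", so the code below assumes one common variant, lasso neighborhood selection: each part coordinate is regressed on all the others with an L1 penalty, and a link is kept wherever a coefficient is nonzero. This is not necessarily the procedure used in the thesis. All function names, the penalty value, and the synthetic one-dimensional "part coordinates" are hypothetical and chosen only for illustration.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(columns, target, penalty, sweeps=100, tol=1e-9):
    """Minimise (1/2n)*||target - sum_j w_j*columns[j]||^2 + penalty*||w||_1."""
    n = len(target)
    weights = [0.0] * len(columns)
    residual = list(target)  # all weights start at zero, so residual == target
    norms = [sum(x * x for x in column) / n for column in columns]
    for _ in range(sweeps):
        largest_change = 0.0
        for j, column in enumerate(columns):
            if norms[j] == 0.0:
                continue
            # Correlation of column j with the residual, adding back its own term.
            rho = sum(x * (r + weights[j] * x) for x, r in zip(column, residual)) / n
            new_weight = soft_threshold(rho, penalty) / norms[j]
            delta = new_weight - weights[j]
            if delta != 0.0:
                residual = [r - delta * x for x, r in zip(column, residual)]
                weights[j] = new_weight
                largest_change = max(largest_change, abs(delta))
        if largest_change < tol:
            break
    return weights


def center(column):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(column) / len(column)
    return [x - mean for x in column]


def learn_sparse_links(samples, penalty):
    """Return undirected links (i, j) found by lasso neighborhood selection.

    samples: list of rows, one coordinate per part in each row.
    A link is kept if either of the two regressions selects it (the OR rule).
    """
    num_parts = len(samples[0])
    columns = [center([row[k] for row in samples]) for k in range(num_parts)]
    links = set()
    for i in range(num_parts):
        others = [k for k in range(num_parts) if k != i]
        weights = lasso_coordinate_descent(
            [columns[k] for k in others], columns[i], penalty)
        for k, weight in zip(others, weights):
            if abs(weight) > 1e-8:
                links.add((min(i, k), max(i, k)))
    return links


def make_chain_samples(count, seed=0):
    """Synthetic data: parts 0-1-2 form a chain, part 3 is independent."""
    rng = random.Random(seed)
    coupling = 0.5
    noise = (1.0 - coupling ** 2) ** 0.5  # keeps every coordinate at unit variance
    samples = []
    for _ in range(count):
        p0 = rng.gauss(0.0, 1.0)
        p1 = coupling * p0 + noise * rng.gauss(0.0, 1.0)
        p2 = coupling * p1 + noise * rng.gauss(0.0, 1.0)
        p3 = rng.gauss(0.0, 1.0)
        samples.append([p0, p1, p2, p3])
    return samples


if __name__ == "__main__":
    print(sorted(learn_sparse_links(make_chain_samples(2000), penalty=0.3)))
```

On this synthetic chain, parts 0 and 2 are correlated only through part 1, so a sufficiently large penalty should leave them unlinked while keeping the two direct links. In a real setting each part would contribute both an x and a y coordinate, and the penalty would need to be tuned, for example by cross-validation.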
Keywords/Search Tags: Object class recognition, HMAX, Invariance, Probabilistic graphical model, GMRF, Spatial configuration