Font Size: a A A

Image Category Recognition Based On Local Features And Probabilistic Graphical Models

Posted on:2009-12-29Degree:MasterType:Thesis
Country:ChinaCandidate:Y LiFull Text:PDF
GTID:2178360242480976Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
The problem to be solved in this article is visual object categorization. Thisis one of important problem in the area of computer visual. We can define visualobject categorization as the process of assigning a specific object to a certaincategory. However, despite decades of intensive research, even the most sophis-ticated recognition systems today remain incapable of handling more than justa few simple classes, or of functioning under unconstrained real-world condi-tions. The problem is that computers cannot see very well, at the present time:they are unable to interpret the colored pixels in the images into the higher levelrepresentations as humans do. Describing an image by its contents is in manycases far more useful than the original pixel representation, which is why ourvisual system is so good at recognition. Looking at humans, and comparingtheir recognition performance with artificial systems, it turns out that humansare much better in categorization than machines.This article presents a probabilistic model for visual object categorizationin weak supervision. The model is improved from Probabilistic Latent Seman-tic Analysis. Despite this model is introduced into the area of computer visual,however, it does not use spacial information in images, but is only dependenton the set of extracted features of local regions. we improve their work byadding spacial information invariant on scale and translation changes. We callthis novel method'Local Spacial Relation pLSA'. This model extracts local fea-tures as input by using techniques of local features detectors and descriptions,and is modeled by using Probabilistic Graphical Models. Then it is learned byExpectation Maximization Algorithm, and a classifier for some object catego-rization is gained. This model is a generative model, so each model is createdspecifically for each category.A representation which is based on'key features'typically extracts fea- ture vectors by applying some'interest operator'to the image. Local featureshave been used very successfully in the development of current categorizationsystems. Categorization from local features is one of the core topics and thevarious aspects of this approach are discussed throughout this article. Whilemany detection and description methods have been used for decades, objectcategorization has shed a new light on these algorithms, and there has been asignificant number of new contributions and of review articles in the past fewyears. Detectors can help to reduce the amount of data to be processed, focusingattention on salient image events as salient points, lines/edges, and regions ofhomogeneity. When salient regions or points with their supporting regions havebeen detected, the obvious way to proceed is to try to come up with a represen-tation of such salient regions in terms of descriptive features. Features are re-quired for a feasible correspondence analysis. Typically, a descriptor of a salientregion will be comprised of a number of different features, often represented asa feature vector. There are several possibilities to obtain descriptors which areinvariant or at least robust in response to certain distortions. Another possibilityis to use descriptors which themselves are invariant against distortions and canbe used on arbitrary patches which are extracted by any saliency detector. Thus,normalization and/or invariance can occur in both of the processes of detectionand/or description. The resulting combination of detector and descriptor can beinvariant to scaling, rotation, affine deformation and change in illumination, de-pending on the amount of geometric and radiometric distortion that should becompensated in the application.A graphical model is a family of probability distributions defined in termsof a directed or undirected graph. The nodes in the graph are identified with ran-dom variables, and joint probability distributions are defined by taking productsover functions defined on connected subsets of nodes. By exploiting the graph-theoretic representation, the formalism provides general algorithms for comput-ing marginal and conditional probabilities of interest. Moreover, the formalism provides control over the computational complexity associated with these opera-tions. The graphical model formalism is agnostic to the distinction between fre-quentist and Bayesian statistics. However, by providing general machinery formanipulating joint probability distributions, and in particular by making hierar-chical latent variable models easy to represent and manipulate, the formalismhas moved to be particularly popular within the Bayesian paradigm. ViewingBayesian statistics as the systematic application of graph-theoretic algorithmsto probability theory, it should not be surprising that many authors have viewedgraphical models as a general Bayesian"inference engine". What is perhapsmost distinctive about the graphical model approach is its naturalness in for-mulating probabilistic models of complex phenomena in applied fields, whilemaintaining control over the computational cost associated with these models.The simplest possible object model is to use no model at all.'bag of key-points'has also been termed'model-free'or'geometry-free'. The basic idea isto extract salient points (keypoints) from images, and to represent an image asa set of such keypoints including some descriptor. In the first step, a set of key-points and their descriptors (feature vectors) can be extracted from all imagesthat are presented to a'bags of keypoints'categorization system. Next, classi-fiers have to be found that can discriminate between the various categories. Anumber of successful categorization systems of this type have been presentedover the past years. Sivic et al. achieve this using a model developed in thestatistical text literature: probabilistic Latent Semantic Analysis (pLSA). In textanalysis this is used to discover topics in a corpus using the bag-of-words docu-ment representation. Here we treat object categories as topics, so that an imagecontaining instances of several categories is modeled as a mixture of topics.However, there is one major drawback inherent to this approach: When thetraining images are not restricted to a pure presentation of the objects them-selves(cropped out from the background, or in front of a prepared, homoge-neous background), object localization will generally be poor. In this article, we pesent a novel approach called'Local Spacial Relations pLSA'(LSR-pLSA).This model is presented as a novel technique of learning spacial models forvisual object categorization. Combined the local spacial relations model withstatistical visual words and expectation maximization, LSR-pLSA is developedas an implementation of object classification algorithm. LSR-pLSA uses anunsupervised process that can capture both spacial relations and visual wordsappearances simultaneously. In contract to other methods which explicitly givesome parameterized spacial models, the proposed algorithm uses a latent topicdiscovery model to reveal some certain latent spacial relations. It uses an unsu-pervised learning paradigm which can avoid some manual controls. It can resistsome geometry transforms and is a dense model. The spacial relations are latentwhich have more insight into describing the object structure.The experiments are demonstrated on some standard datasets and showthat LSR-pLSA is a promising model for solving visual object categorizationproblems, especially for translation, rotation, scale, affine and occlusion.
Keywords/Search Tags:Probabilistic
PDF Full Text Request
Related items