Font Size: a A A

Object category recognition in real-world scenes

Posted on:2007-07-15Degree:M.A.ScType:Thesis
University:University of Waterloo (Canada)Candidate:Fazl Ersi, EhsanFull Text:PDF
GTID:2448390005477261Subject:Engineering
Abstract/Summary:
Object category recognition is one of the core capabilities of the human visual system. Yet, computer vision systems are far from reaching a comparable level of performance. Moreover, computer vision research has in the past mainly focused on the simpler and more specific problem of identifying known objects under different viewing conditions, however for object categorization, the goal is to recognize (and localize) unknown instances of object categories in the scenes.; The available approaches to visual object categorization are either part based, which use the appearance of object parts to perform recognition; or part and structure based, which incorporate also the geometry and shape information of the objects in recognition. The former approaches are capable of dealing with global transformations (e.g., translation, rotation, scale), but without the ability to localize object instances in the scenes, which is the main advantage of the latter approaches. In this thesis we propose a novel model for object category recognition in real-world scenes, which is a compromise between the two types of approaches. Images in our model are represented by a set of triangular labelled graphs, each containing information on the appearance and geometry of a 3-tuple of distinctive image regions, including a spatial position, scale and orientation. In the learning stage, our model automatically learns a set of codebooks of model graphs for each object category, where each codebook contains information about which local structures may appear on which parts of the object instances of the target category. A two-stage method for optimal matching is developed, where in the first stage a Bayesian classifier based on ICA factorization is used efficiently to select from the matched codebook, and in the second stage a nearest neighbourhood classifier is used to assign the test graph to one of the learned model graphs. Each matched test graph casts votes for possible identity and poses of an object instance, and then a Hough transformation technique is used in the pose space to identify and localize the object instances.; An extensive evaluation on several large datasets, not only validates the robustness of our proposed model in object category recognition and localization in the presence of scale and rotation changes, but it also shows that our proposed technique can be used completely or in part in other visual recognition problems, such as individual object recognition and face recognition.
Keywords/Search Tags:Object, Recognition, Visual, Scenes, Used
Related items