
Designing representational architectures in recognition

Posted on: 2012-11-13
Degree: Ph.D
Type: Thesis
University: University of Illinois at Urbana-Champaign
Candidate: Farhadi, Ali
Full Text: PDF
GTID: 2458390008495588
Subject: Computer Science
Abstract/Summary:
Recognition is a deep and fundamental question in computer vision. If approached correctly, object recognition provides insight into several interesting problems with crucial applications. In a typical setting, recognition is defined as the problem of learning about a fixed set of categories from training examples provided for those categories. At test time, the problem is then to decide which of those learned categories a test image belongs to. This thesis questions the typical settings of recognition and shows that shifting our point of view to the fundamentals of recognition leads to remarkable achievements.

In current settings, the final goal of recognition systems is to predict a list of category name tags for images. But there is more to recognition than a list of category names. Images exhibit a great deal of information that cannot be conveyed with a list of name tags. The main focus of this thesis is to produce richer descriptions for images. Inspired by how humans describe images, our goal is to describe images with sentences. This thesis introduces a non-parametric approach for describing images with sentences that produces promising results. Exploring the idea of describing images with sentences raises deep and interesting concerns in recognition: how to deal with unfamiliar objects, how to describe objects, and how to recognize complex composites of objects.

This thesis introduces visual attributes and shows how attribute-based recognition can reason about unfamiliar objects. Attribute-based recognition also allows the description of objects, the reporting of unusual properties of familiar objects, and learning about novel categories with few or even no visual training examples (from purely textual descriptions of categories). Analogous to phrases in machine translation, this thesis also introduces visual phrases: elements of recognition that correspond to a chunk of meaning bigger than objects and smaller than scenes. Visual phrases have such a characteristic appearance that detecting them as one entity is much simpler and significantly more accurate than detecting the participating objects. This thesis shows that including visual phrases in the vocabulary of recognition results in significant improvements in recognition.

The work presented in this thesis tries to provide insight into deep and yet basic questions in recognition: What should we recognize? At what level should we recognize entities? What does learning about some objects reveal about other objects? What should we say when an unfamiliar object is presented? How can we learn to predict deviations from typicality within categories? What should be the output of a recognition system? And what is the quantum of recognition?...
Keywords/Search Tags: Recognition, Categories, Images with sentences, Objects