
Finding Birds in Trees: Building Categories from Image Streams

Posted on: 2011-05-23
Degree: Ph.D
Type: Dissertation
University: University of California, Los Angeles
Candidate: Ko, Teresa H
Full Text: PDF
GTID: 1448390002969447
Subject: Computer Science
Abstract/Summary:
We explore automated object detection and categorization in image sequences captured in natural environments. These environments pose significant modeling challenges, such as complex texture, background motion, and object mimicry. We present a general background model that is applicable to natural scenes. Our approach models the underlying warping of pixel locations arising from background motion. The background is modeled as a set of warping layers where, at any given time, different layers may be visible due to the motion of an occluding layer. Foreground regions are thus defined as those that cannot be modeled by any composition of warpings of these background layers. We illustrate this concept by first restricting the possible warps to those in which pixels are displaced only within a spatial neighborhood, and then learning the appropriate size of that neighborhood. We then show how changes in the intensity/color histograms of pixel neighborhoods can be used to discriminate foreground from background regions. We find that this approach compares favorably with the state of the art while requiring less computation.

We have designed and implemented a system that, given an image sequence and user input, catalogs putative objects of interest into viewable clusters. We introduce two object representations. One is a set of feature histograms, each corresponding to a viewpoint of the object. The other is an object barcode that represents whether or not a feature is present across all views. The approach is unbiased towards redundant views: it does not matter how many times an object appears from the same viewpoint. At the same time, the approach does not penalize missing views, so successful object categorization does not require capturing all viewpoints. We use these representations to group objects into viewable clusters that users can label according to the categories that interest them. We then feed these labels back into the system to automatically label new objects that appear in the image sequence. We find that the system significantly reduces the amount of time users would spend looking at uninformative images.
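
To make the object barcode concrete, the following is a minimal sketch and not the dissertation's implementation. It assumes that each view is a histogram of counts over a fixed feature vocabulary, that a barcode bit is set when the feature occurs in at least one view, and that barcodes are compared with an overlap coefficient. The names `barcode` and `overlap`, the five-word vocabulary, and the example histograms are all hypothetical.

```python
from typing import List, Sequence


def barcode(view_histograms: Sequence[Sequence[int]]) -> List[int]:
    """Binary presence vector over the feature vocabulary.

    Assumed reading: bit k is 1 if feature k occurs in at least one view.
    """
    if not view_histograms:
        return []
    n_features = len(view_histograms[0])
    return [
        int(any(view[k] > 0 for view in view_histograms))
        for k in range(n_features)
    ]


def overlap(a: Sequence[int], b: Sequence[int]) -> float:
    """Overlap coefficient: shared features divided by the smaller barcode.

    Chosen here (an assumption) so that an object seen from fewer
    viewpoints is not penalized for the features it never showed.
    """
    shared = sum(x & y for x, y in zip(a, b))
    smaller = min(sum(a), sum(b))
    return shared / smaller if smaller else 0.0


# Hypothetical data: each row is one view's feature histogram.
bird_once = [[3, 0, 1, 0, 0], [0, 2, 1, 0, 0]]
bird_repeated = bird_once + [bird_once[0]] * 4  # first viewpoint seen 4 more times
bird_partial = [bird_once[0]]                   # second viewpoint never captured

# Redundant views leave the barcode unchanged.
print(barcode(bird_once) == barcode(bird_repeated))
# A missing view does not lower the similarity score.
print(overlap(barcode(bird_once), barcode(bird_partial)))
```

Because the barcode records only presence, repeating a viewpoint adds no new bits, and the overlap coefficient ignores features that only the more completely observed object exhibits. Other set similarities, such as Jaccard, would penalize the missing view and so would not match the property described above.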
Keywords/Search Tags: Image, Object