Ideal observer analysis of object recognition

Posted on: 1998-02-11
Degree: Ph.D.
Type: Dissertation
University: University of Minnesota
Candidate: Tjan, Bosco Siautung
Full Text: PDF
GTID: 1468390014975478
Subject: Computer Science
Abstract/Summary:
General-purpose object recognition is an important function of the human visual system. We applied quantitative techniques that have been successful in studying low-level sensory processing to the study of visual object recognition. We emphasized that the performance of a visual system depends on two factors: (1) how the system processes visual information, and (2) how much task-relevant information there is to be processed. By constructing an ideal observer for visual object recognition, we proposed four quantitative measurements to characterize these factors. Statistical efficiency measures the accuracy of a visual system relative to the informational constraints inherent in a task. View-processing rate gauges a visual system's processing speed with respect to a task's inherent requirements for representing the details of objects. These two measurements allow direct comparison of a visual system's processing capability across different visual recognition tasks. A third measure, stimulus informativeness, determines the maximum achievable level of accuracy for a given task across all visual systems. Finally, view complexity quantifies a task's representational requirements, which are imposed on any observer. We applied these measurement techniques to resolve issues in object recognition research. Specifically, we (1) determined the factors that limit human processing capability for object recognition, (2) measured the limits on letter recognition for different fonts when the letters were presented with noise, spatial uncertainty, and blur, and (3) measured the equivalent number of 2-D views required to represent different 3-D object ensembles to maximally facilitate recognition. We also proposed a framework, based on task partitioning, for the analysis of the general-purpose visual object recognition problem. This framework defines an object recognition task in terms of its input and context. In addition, it expresses an implementation of an object recognition task in terms of a network of other recognition tasks. The information sources for the input and context of each task in an implementation are made explicit by this framework.
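As an illustration of the ideal-observer idea summarized above, the following is a minimal sketch rather than the dissertation's actual method. It assumes a toy task: identifying one of several known, equally likely templates corrupted by additive Gaussian white noise, where the maximum-likelihood decision reduces to picking the nearest template. The function names, the toy templates, and the definition of efficiency as a ratio of threshold signal energies at matched accuracy are all assumptions made for illustration; the abstract does not state the exact definitions used.

```python
import random


def ideal_observer_choice(stimulus, templates):
    """Return the index of the template nearest to the stimulus.

    Assumption: equal priors and additive Gaussian white noise, so the
    maximum-likelihood choice is the template with the smallest squared
    Euclidean distance to the stimulus.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(templates)),
               key=lambda i: sq_dist(stimulus, templates[i]))


def ideal_accuracy(templates, noise_sd, trials=2000, seed=0):
    """Monte Carlo estimate of the ideal observer's proportion correct."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        target = rng.randrange(len(templates))
        # Corrupt the chosen template with independent Gaussian noise.
        stimulus = [v + rng.gauss(0.0, noise_sd) for v in templates[target]]
        if ideal_observer_choice(stimulus, templates) == target:
            correct += 1
    return correct / trials


def efficiency(ideal_threshold_energy, observer_threshold_energy):
    """Hypothetical efficiency: ratio of the signal energy the ideal
    observer needs to the energy a tested observer needs, both measured
    at the same accuracy criterion."""
    return ideal_threshold_energy / observer_threshold_energy


# Toy templates (hypothetical): three orthogonal unit "objects".
TEMPLATES = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

In this sketch, `ideal_accuracy(TEMPLATES, noise_sd)` plays the role of the task's upper bound on accuracy at a given noise level, and `efficiency` compares a tested observer against that bound. An observer that needs four times the ideal's signal energy to reach the same accuracy would have an efficiency of 0.25 under this assumed definition.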
Keywords/Search Tags: Object recognition, Visual, Task, Observer