
A framework for indexing higher-level content in natural images: A study on the far side of the semantic gap

Posted on: 2005-12-15
Degree: Ph.D
Type: Dissertation
University: Arizona State University
Candidate: Black, John Arthur, Jr
Full Text: PDF
GTID: 1458390008986990
Subject: Computer Science
Abstract/Summary:
Content-based image retrieval has been an active area of research for over a decade. However, the performance of content-based image retrieval systems is still regarded as unsatisfactory. The primary reason for this is that image retrievals are typically based on low-level content (such as color, or the spatial frequency and orientation of textures within images), while humans consciously perceive the content of images at a much higher level. This research is based on a conceptual framework that recognizes four distinct levels of content: (1) image content, (2) visual content, (3) semantic content, and (4) affective content. Image content can be extracted from images using image processing techniques. Visual content triggers first-order feature detectors in the primary visual cortex, thus triggering percepts. Semantic content evokes visual concepts in the mind of the viewer. Affective content evokes "feelings" or "emotions" or "impressions" in the viewer. Within each of these levels of content are multiple types of content. This study asks human participants to use two sets of words (called lexical basis functions) as frameworks for measuring the semantic and affective content in a set of outdoor scenery images, called the NaturePix image set. This produces a semantic content vector and an affective content vector for each image. When distance metrics are applied to these vectors, the resulting semantic clustering and affective clustering of the images are found to correlate very well with subjective clustering, as measured with two independent ground truth procedures. These clusters are also shown to be cognitively coherent by labeling each cluster with a short phrase, and then asking human participants to use these phrases to choose images from the NaturePix image set.
Thus, the results of these experiments show that lexical basis functions provide a means for indexing the content of the NaturePix image set in a way that correlates with similarities perceived by human participants. If detectors can be designed to detect the types of semantic and affective content represented by these lexical basis functions, they could provide a means for indexing the high-level content of images in a manner that is more satisfactory to humans.
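The idea of a content vector built from word ratings, compared with a distance metric, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not list the words in either lexical basis, the rating scale, or the distance metric used, so the three-word basis, the ratings, the averaging step, and the Euclidean distance below are hypothetical stand-ins rather than the dissertation's method.

```python
import math
from statistics import mean

# Hypothetical lexical basis; the abstract does not list the actual words.
BASIS = ["mountain", "water", "forest"]

# Hypothetical ratings: image -> one list per participant, aligned with BASIS.
# These values are illustrative only, not data from the study.
RATINGS = {
    "img_a": [[5, 1, 2], [4, 1, 1]],
    "img_b": [[4, 2, 1], [5, 1, 2]],
    "img_c": [[1, 5, 4], [1, 4, 5]],
}


def content_vector(ratings):
    """Average the participants' ratings word by word (an assumed aggregation)."""
    return [mean(column) for column in zip(*ratings)]


def euclidean(u, v):
    """Euclidean distance, used here as one possible distance metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest(image, vectors):
    """Return the name of the other image whose vector is closest to `image`."""
    others = (name for name in vectors if name != image)
    return min(others, key=lambda name: euclidean(vectors[image], vectors[name]))


VECTORS = {name: content_vector(r) for name, r in RATINGS.items()}
```

With these made-up ratings, `nearest("img_a", VECTORS)` returns `"img_b"`, since both images score high on the first word and low on the others. A clustering step would group images using the same pairwise distances.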
Keywords/Search Tags: Content, Image, Semantic, Indexing, Lexical basis functions