Meta data extraction for conceptual image indexing

Posted on: 2005-02-05
Degree: M.S
Type: Thesis
University: The University of Texas at Arlington
Candidate: Ansari, Amber Abdul Wahid
Full Text: PDF
GTID: 2458390008978918
Subject: Computer Science
Abstract/Summary:
Concept-based indexing of multimedia elements is becoming an important aspect of the storage and retrieval of multimedia resources from the World Wide Web. The text on an HTML page that contains a multimedia resource may carry useful metadata that helps identify concepts related to that resource. How well this works depends on the amount of metadata present in, and extracted from, the context. In this thesis, we further enhance the context sensitivity of the 3C architecture, a content, context, and common-background knowledge integration architecture for multimedia information access. To achieve this, we propose and extract new contextual features. Using this enhanced set of features, or metadata, we find new named entities (concepts) and compute a relevance score for each concept-image pair. We then build a concept-based index for the retrieved multimedia elements, specifically for the travel domain, and query it using a concept expansion mechanism. Central to concept expansion is the Latent Semantic Analysis (LSA) algorithm, which finds associated words or concepts (rather than related syntactic phrases) that may not have been present in the original web document. We train the LSA algorithm on a travel-domain-specific corpus and apply new filtering methods to produce new associated concepts, thus allowing the user to perform conceptual queries in the travel domain.
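To make the concept expansion step concrete, here is a minimal sketch of LSA-style term association in plain Python. It is not the thesis's implementation. The toy corpus `DOCS`, the function names, the rank `k=2`, and the cosine `threshold` are all assumptions chosen for illustration, and the truncated SVD is approximated by power iteration on the term co-occurrence matrix rather than by a library routine.

```python
import math
from collections import Counter

# Hypothetical toy corpus (not from the thesis): two travel topics
# with disjoint vocabularies, one document per string.
DOCS = [
    "paris eiffel tower",
    "paris louvre museum",
    "eiffel tower seine",
    "beach surf hawaii",
    "beach resort hawaii",
    "surf waves hawaii",
]


def term_document_matrix(docs):
    """Return (vocab, A) where A[i][j] counts term i in document j."""
    vocab = sorted({w for d in docs for w in d.split()})
    counts = [Counter(d.split()) for d in docs]
    return vocab, [[float(c[t]) for c in counts] for t in vocab]


def top_eigenpairs(sym, k, iters=500):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _ in range(iters):
            w = [sum(work[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(work[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next pass.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


def lsa_term_vectors(docs, k=2):
    """Map each term to its k-dimensional latent vector (U_k scaled by sigma_k)."""
    vocab, a = term_document_matrix(docs)
    n = len(vocab)
    # A A^T: its eigenvectors are the left singular vectors of A,
    # and its eigenvalues are the squared singular values.
    cooc = [[sum(x * y for x, y in zip(a[i], a[j])) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(cooc, k)
    return {
        t: [math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
        for i, t in enumerate(vocab)
    }


def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def expand(term, vectors, threshold=0.5):
    """Return terms whose latent vectors are close to `term` (assumed filter: cosine >= threshold)."""
    q = vectors[term]
    return sorted(
        other for other, vec in vectors.items()
        if other != term and cosine(q, vec) >= threshold
    )


if __name__ == "__main__":
    vectors = lsa_term_vectors(DOCS, k=2)
    print(expand("louvre", vectors))
```

On this toy corpus, expanding "louvre" returns "seine" among the associated terms even though the two words never share a document, which is the kind of association the abstract attributes to LSA. A real system would train on a much larger domain corpus and would likely use a library SVD.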
Keywords/Search Tags:Concept, Travel domain, Multimedia, New