A probablistic framework for mapping audio-visual features to high-level semantics in terms of concepts and context

Posted on:2002-11-08

Degree:Ph.D

Type:Thesis

University:University of Illinois at Urbana-Champaign

Candidate:Naphade, Milind Ramesh

Full Text:PDF

GTID:2468390011496899

Subject:Engineering

Abstract/Summary:

Multimedia content is an essential part of information technology. However, the difficulty in filtering, searching, and summarizing video has so far hindered the effective utilization of video databases. Users want to filter and query video by high-level (semantic) concepts, while automatic algorithms can extract only low-level features (e.g., color, texture, shape, amount of motion). Bridging this gap is thus the most challenging problem in video (multimedia) indexing, retrieval, and filtering.; In this thesis we develop a probabilistic framework for mapping low-level features to high-level semantics. This framework consists of representation of concepts using probabilistic multimedia objects (multijects) and representation of contextual constraints of such objects called the multinet. To extract semantics from video, we approach the problem of video understanding as a pattern recognition problem. We develop a framework for modeling the concepts and the context of video content.; Innovative claims of this research include the following: (1) Efficient and generic probabilistic methods for generating a multimedia lexicon of multijects representing semantic concepts consisting of objects, sites, and events using audio and visual features. This is achieved by using probabilistic pattern recognition techniques for fusing multiple modalities. (2) Probabilistic graphical methods for discovering relationships between different semantic concepts and using these relationships to enforce contextual constraints. This leads to enhanced detection of existing concepts and facilitates detection of concepts not observable directly; To alleviate the burden of labeling large data sets for the supervised training of the multiject models, we also present a technique to train models using a partially labeled data set. To support query by audiovisual content, we also implement an algorithm based on the principle of dynamic programming. We also incorporate relevance feedback for this paradigm of query.; The framework is flexible and generic and can therefore be applied easily to applications such as semantic video indexing and filtering, human-computer intelligent interaction, surveillance, and Internet multimedia management.

Keywords/Search Tags:

Video, Semantic, Concepts, Multimedia, Framework, Features, Filtering, High-level

Related items

1	Research On Image Semantic Classification Techniques In Content-Based Image Retrieval
2	High-Level Video Semantic Concept Detection Based On Multiple Features
3	Bridging the semantic gap : Image and video understanding by exploiting attributes
4	Object Detection Based On Mid-level Features
5	Mining Semantics from Low-level Features in Multimedia Computing
6	Video Retrieval And Semantic Information Extraction
7	Video Low-level Features And Middle-level Semantic Representation Research
8	Automatic semantic indexing in multimedia information retrieval
9	Sports Video Semantic Content Analysis
10	Research On Cross-Level Saliency Detection Based On High-Level Semantic Feature