Automatic segmentation, indexing and retrieval of audiovisual data based on combined audio and visual content analysis

Posted on:2000-05-12

Degree:Ph.D

Type:Thesis

University:University of Southern California

Candidate:Zhang, Tong

GTID:2468390014461948

Subject:Engineering

Abstract/Summary:

This thesis proposed a system for automatic segmentation, indexing and retrieval of audiovisual data based on multimodal media content analysis. The purpose was to generate metadata for video sequences to support information filtering and retrieval. The audiovisual stream was demultiplexed into different media types such as audio, image and caption. An index table was generated for each video clip by combining the results of content analysis of these media types. Structures for different video types were described, and a model was built for each video type individually. This general modeling and structuring of video content parsing distinguishes the work from existing approaches, which normally adopt a single model that focuses on the pictorial information alone, and it supports more functions than they do.

For content-based management of audio data, a hierarchical system consisting of three stages was developed. In the first stage, the accompanying audio signal was segmented and classified on-line into twelve basic types of sound. The segment boundaries were set precisely, and a classification accuracy above 90% was achieved. This procedure is generic and model-free. In the second stage, fine-level classification of environmental sounds was performed with hidden Markov models. Experimental results showed an accuracy rate of 86%. Finally, based on this classification approach, a query-by-example retrieval scheme for sound effects was proposed and shown to be effective.

For content analysis of image sequences, an efficient and robust method was developed for shot change detection. The method was derived from the twin-comparison algorithm, with one new ingredient: the histogram difference of the Y- and V-components was incorporated. The proposed method achieved both a sensitivity rate and a recall rate of around 95% on various kinds of test video. A scheme was also proposed for adaptive keyframe extraction based on histogram comparison. Experiments demonstrated that it could generate keyframes that properly represent the content of a shot.
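
The shot change detection step can be illustrated with a minimal Python sketch of a twin-comparison detector driven by Y- and V-component histograms. It is not the thesis's implementation. The function names (yv_histogram, hist_diff, twin_comparison), the bin count, the two threshold values, and the way the Y and V histograms are combined (simple concatenation with an L1 distance) are all assumptions made for illustration. The sketch also omits the tolerance for brief dips below the low threshold that fuller versions of twin-comparison usually include.

```python
import numpy as np


def yv_histogram(frame_yuv, bins=64):
    """Concatenate the normalized Y and V histograms of one YUV frame.

    frame_yuv is assumed to be an (H, W, 3) uint8 array in Y, U, V order.
    """
    y = np.histogram(frame_yuv[..., 0], bins=bins, range=(0, 256))[0]
    v = np.histogram(frame_yuv[..., 2], bins=bins, range=(0, 256))[0]
    n_pixels = frame_yuv.shape[0] * frame_yuv.shape[1]
    return np.concatenate([y, v]) / n_pixels


def hist_diff(h1, h2):
    """L1 distance between two Y+V histogram vectors, scaled to [0, 1]."""
    # Each half sums to 1, so the largest possible L1 distance is 4.
    return 0.25 * np.abs(h1 - h2).sum()


def twin_comparison(hists, t_high=0.5, t_low=0.1):
    """Detect abrupt cuts and gradual transitions with two thresholds.

    t_high and t_low are assumed values, not taken from the thesis.
    Returns (cuts, gradual): cuts holds the index of the first frame of
    each new shot; gradual holds (start, end) frame index pairs.
    """
    cuts, gradual = [], []
    start = None  # last frame before a candidate gradual transition
    for i in range(1, len(hists)):
        d = hist_diff(hists[i - 1], hists[i])
        if t_low <= d < t_high:
            # Small but sustained change: open or extend a candidate.
            if start is None:
                start = i - 1
            continue
        # The change is either a cut or negligible: close any candidate
        # and accept it only if the accumulated change is large enough.
        if start is not None:
            if hist_diff(hists[start], hists[i - 1]) >= t_high:
                gradual.append((start, i - 1))
            start = None
        if d >= t_high:
            cuts.append(i)
    # A candidate may still be open when the sequence ends.
    if start is not None and hist_diff(hists[start], hists[-1]) >= t_high:
        gradual.append((start, len(hists) - 1))
    return cuts, gradual
```

The high threshold catches abrupt cuts from a single frame-to-frame difference. The low threshold marks the possible start of a dissolve or fade, which is confirmed only if the difference between the start frame and the end frame of the candidate reaches the high threshold. In practice both thresholds would need tuning on representative video.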

Keywords/Search Tags:

Audiovisual data, Content, Segmentation, Retrieval, Proposed, Rate

Related items

1	On The Characteristics And Influence Of Reading And Audiovisual Content Of Micro - Letter
2	Content Based Multimodal Video Retrieval
3	Internet Audiovisual Program Supervision System Analysis And Research
4	Research On Content-Based Video Shot Segmentation And Retrieval Technology
5	Semantic video object segmentation for content-based multimedia application
6	The Key Technique Research Of Content-Based Video Retrieval
7	Content-based visualization and retrieval for image libraries
8	Study Of Content-based News Video Retrieval System
9	A Number Of Technical Studies, Content-based Video Retrieval
10	Study On The Key Technology Of Large-scale Content-based Remote Sensing Image Retrieval