
Video Fusion Analyzing And Semantic Understanding

Posted on: 2007-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: H. Yang
GTID: 2178360182466705
Subject: Computer application technology

Abstract/Summary:
The explosive growth of video data in large-scale information repositories such as digital libraries and the World Wide Web poses new challenges to conventional video analysis and information retrieval technologies. A typical repository holds a huge volume of heterogeneous data and serves a large population of non-expert users. A video retrieval tool that supports semantic-level queries is therefore highly desirable in this setting, yet such queries remain beyond the capability of state-of-the-art video retrieval technologies. The work presented in this thesis extends conventional video analysis and retrieval research by proposing two key techniques for digital libraries: multimodal fusion analysis of video, and automatic video annotation and retrieval. These techniques provide critical building blocks for video analysis and retrieval facilities in digital libraries and similar information repositories.

The thesis opens with an overview of the background, motivation, and basic approach of our research.

Chapter 2 reviews fundamental research on video structure and shot boundary detection. It also introduces and discusses the most advanced work on multimodal fusion analysis, automatic annotation, and semantic-level retrieval for video, including basic approaches, related techniques, and typical systems.

Chapter 3 proposes a novel algorithm for multimodal video fusion analysis based on the Maximum Entropy model. Video carries rich semantic information that can be represented by multimodal features, including text transcripts and visual/audio features. The Maximum Entropy model is trained on these multimodal features and applied to video semantic understanding and story segmentation.

Chapter 4 describes a comprehensive mechanism for automatic video annotation.
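The Chapter 3 idea — a Maximum Entropy model over multimodal features deciding, for example, whether a shot begins a new story — can be sketched as a minimal binary maximum-entropy (logistic) classifier. Everything here is illustrative: the feature dimensions, the synthetic data, and the hyperparameters are invented, not taken from the thesis.

```python
import numpy as np

# Toy "multimodal" feature vectors for video shots: each row stands in for a
# concatenation of hypothetical text, visual, and audio features.
rng = np.random.default_rng(0)
X_boundary = rng.normal(loc=1.0, scale=0.5, size=(40, 6))   # story-boundary shots
X_interior = rng.normal(loc=-1.0, scale=0.5, size=(40, 6))  # within-story shots
X = np.vstack([X_boundary, X_interior])
y = np.array([1] * 40 + [0] * 40)

def train_maxent(X, y, lr=0.5, steps=300):
    """Binary maximum-entropy (logistic) model fit by gradient ascent
    on the log-likelihood."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # P(boundary | features)
        w += lr * (X.T @ (y - p)) / len(y)      # gradient of log-likelihood
        b += lr * np.mean(y - p)
    return w, b

w, b = train_maxent(X, y)
pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
accuracy = float(np.mean(pred == y))
```

In a real system each shot's features would come from transcript terms, color/motion descriptors, and audio cues, and the model would typically be multiclass over several semantic labels rather than the binary case shown here.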
Manual annotation has traditionally been used to build linguistic indexes that support semantic-based video management and retrieval. However, this approach is prone to subjectivity and requires a huge amount of human effort, especially for large video collections. Our automatic annotation method exploits the temporal information in transcripts that accompany broadcast news, enabling deeper semantic understanding and improving search results.

Chapter 5 proposes a video analysis and retrieval system built on the preceding work, comprising an off-line video fusion analysis subsystem and an on-line semantic video retrieval subsystem. The complete application has been tested in a digital library, and the results show that the proposed solution effectively improves the efficiency of video retrieval.

Chapter 6 concludes the thesis with a brief discussion of application prospects and future research directions.
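The Chapter 4 annotation mechanism — attaching transcript terms to shots via their timestamps — can be sketched as a simple interval assignment. The data structures and values below are hypothetical, chosen only to illustrate the temporal-alignment idea.

```python
def annotate_shots(shots, transcript):
    """Assign time-stamped transcript words to video shots.

    shots: list of (start, end) times in seconds for each shot.
    transcript: list of (timestamp, word) pairs from broadcast-news text.
    Returns one keyword list per shot, built from the words whose
    timestamp falls inside that shot's interval.
    """
    annotations = [[] for _ in shots]
    for t, word in transcript:
        for i, (start, end) in enumerate(shots):
            if start <= t < end:
                annotations[i].append(word)
                break
    return annotations

shots = [(0.0, 5.0), (5.0, 12.0)]
transcript = [(1.2, "election"), (4.8, "senate"), (7.5, "weather")]
print(annotate_shots(shots, transcript))
# [['election', 'senate'], ['weather']]
```

A production pipeline would add keyword filtering (stop-word removal, term weighting) and tolerate imperfect transcript timing, but the core alignment step is this interval lookup.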
Keywords/Search Tags: multimedia, video, multimodal, fusion analysis, semantic annotation, statistical learning, information retrieval, digital library