Research On News Video Content Analysis Based On Multimodality Information

Posted on:2008-08-03

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Z Ji

GTID:1118360272485495

Subject:Signal and Information Processing

Abstract/Summary:

Semantic video management, including video browsing, indexing and retrieval, is necessary for the effective utilization of video repositories. Video content analysis technology aims to bridge the semantic gap between low-level features and high-level concepts, and to provide an accessible way to organize and manage video data.

In this dissertation, research efforts are concentrated on audio, caption and visual content analysis and on multimodality information fusion techniques for news video, using pattern recognition models. The three main contributions are as follows:

(1) A novel anchorperson shot detection algorithm in the MPEG domain is proposed, in which an improved compressed-domain face detection method and a new dissimilarity metric for clustering are presented. The proposed algorithm is effective and computationally efficient.

(2) A new video shot classification method based on a decision tree is proposed. Six semantic types are studied and categorized: Commercial, Others, Still Image, Anchorperson, Reporter and Monologue. The first three types are identified with black frame, motion activity, shot duration and face features. The anchorperson shots are detected by a clustering method. The reporter and monologue shots are distinguished by a conditional random field (CRF) model, where the detection is cast as a sequence labeling problem using audio, face, motion and temporal information. The experimental results demonstrate the effectiveness of the method.

(3) A novel news story segmentation method is proposed, fusing multimodality information from the results of audio classification, caption extraction and video shot classification. The video shot sequence is transformed into several keyword sequences, so that news story segmentation is treated as a sequence segmentation problem. A CRF model is employed to fuse the context information within and between the keyword sequences. Experiments show that the approach is feasible and yields improved results.

In addition, various video content analysis techniques are surveyed, a layered audio classification method based on rules and a hidden Markov model (HMM) is developed, a caption extraction framework for news video is designed and implemented, and a COM-based video content analysis and abstraction system is designed and implemented.

Overall, the dissertation provides an in-depth investigation into semantic concept detection and multimodality information fusion.
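To make the sequence-labeling view in contribution (3) concrete, the following is a minimal sketch and not the dissertation's implementation. It assumes each shot is described by three keywords (shot type, audio class, caption presence), tags each shot as "B" (begins a story) or "I" (continues a story), and decodes the best tag sequence with the Viterbi algorithm, as a linear-chain CRF does at inference time. The label set, feature names and all weights are hypothetical; in a real CRF the weights would be learned from annotated news video.

```python
# Illustrative only: the labels, keyword names and weights below are
# assumptions made for this sketch, not values from the dissertation.
LABELS = ("B", "I")  # B = shot begins a news story, I = shot continues one

# Hand-set scores standing in for learned CRF feature weights.
EMISSION = {
    ("shot", "Anchorperson", "B"): 2.0,
    ("shot", "Reporter", "I"): 1.0,
    ("shot", "Monologue", "I"): 1.0,
    ("audio", "silence", "B"): 0.5,
    ("caption", True, "B"): 1.0,
}
TRANSITION = {
    ("B", "B"): -2.0,  # two story starts in a row are unlikely
    ("B", "I"): 0.5,
    ("I", "B"): 0.0,
    ("I", "I"): 0.5,
}


def emission_score(shot, label):
    """Sum the weights of all keyword/label pairs that fire for this shot."""
    return sum(EMISSION.get((name, value, label), 0.0)
               for name, value in shot.items())


def viterbi(shots):
    """Return the highest-scoring label sequence for a list of shots."""
    if not shots:
        return []
    scores = [{lab: emission_score(shots[0], lab) for lab in LABELS}]
    back = []  # back[t][lab] = best previous label for lab at step t + 1
    for shot in shots[1:]:
        prev, cur, ptr = scores[-1], {}, {}
        for lab in LABELS:
            best = max(LABELS, key=lambda p: prev[p] + TRANSITION[(p, lab)])
            cur[lab] = (prev[best] + TRANSITION[(best, lab)]
                        + emission_score(shot, lab))
            ptr[lab] = best
        scores.append(cur)
        back.append(ptr)
    path = [max(LABELS, key=lambda lab: scores[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def story_boundaries(shots):
    """Indices of shots labeled as the start of a story."""
    return [i for i, lab in enumerate(viterbi(shots)) if lab == "B"]


shots = [
    {"shot": "Anchorperson", "audio": "silence", "caption": True},
    {"shot": "Reporter", "audio": "speech", "caption": False},
    {"shot": "Monologue", "audio": "speech", "caption": False},
    {"shot": "Anchorperson", "audio": "silence", "caption": True},
    {"shot": "Reporter", "audio": "speech", "caption": False},
]
print(viterbi(shots))
print(story_boundaries(shots))
```

The point of the sketch is only that the decision for each shot depends on its neighbors through the transition scores, which is what lets context within and between the keyword sequences influence where a story boundary is placed.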

Keywords/Search Tags:

News Video, Video Content Analysis, Anchorperson Shot Detection, Video Shot Classification, News Story Segmentation, Caption Extraction, Multimodality Information Fusion

Related items

1	Video-Based News Structure Analysis And Anchorperson Shot Detection
2	News Video Scene Segmentation Research
3	Research On Content-based Shot Classification In News Video
4	Research On Content-Based News Video Abstraction Technology
5	Research On Content-based News Video Abstraction Technology
6	Analysis, Based On The Detection And Extraction Of The News Video Subtitles
7	Research On Integrating Multimodal Information To Automatically Parsing News Video On The Compression Domain
8	Topic-based Subtitles Extracted From News Video Retrieval Research
9	Research On Content-Based Fast Video Shot Boundary Detection
10	Key Technique Research Of Automatic Segmentation News Video