
Research On Method Of Video Structure Mining Based On Content

Posted on: 2009-05-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C J Fu
Full Text: PDF
GTID: 1118360278456570
Subject: Control Science and Engineering
Abstract/Summary:
Advances in multimedia technologies have yielded a vast amount of video data. This ubiquitous video data calls for efficient and flexible methods to annotate, organize, store, and access video resources. Video mining, which has attracted much research interest in recent years, is defined as the process of discovering implicit, previously unknown knowledge or interesting patterns in a massive set of video data. Using data mining techniques, this thesis explores the higher-level structural knowledge of video from two aspects, syntax and semantics. The main content and innovations are as follows:

(1) Theoretical research on the concepts and methods of video structure mining. Based on the theories of traditional data mining and multimedia data mining, the concepts of video structure mining are defined explicitly. Video structure mining mainly comprises basic structure mining, structure syntax mining, and structure semantics mining. Basic structure mining is the foundation of the other two, and structure syntax mining and structure semantics mining complement each other. A system structure for video structure mining is proposed, which includes video data pre-processing, construction of the video database, multi-dimensional analysis of video data, a video mining function module, and a video mining interface. A functional structure is also proposed, which includes data preprocessing, basic structure mining, structure syntax mining, structure semantics mining, pattern evaluation, and knowledge representation.

(2) Research on content-based video basic structure mining methods. To obtain the hierarchical structure of a video, which consists of frames, shots, scenes, and the video program, a framework for video basic structure mining is proposed. This framework includes shot boundary detection, key frame selection, shot feature extraction, and scene segmentation. The thesis focuses on the algorithms for shot boundary detection and scene segmentation, so as to structure the video stream. Using non-uniform quantization of the HSV (Hue, Saturation, Value) color space, an adaptive-threshold shot boundary detection algorithm based on two-histogram and twice-differentiation is proposed to partition a video into shots. Based on the similarities of the HSV color histogram, the homogeneous texture descriptor (HTD), and the edge histogram descriptor (EHD) among video shots, two scene construction methods are developed: a shot clustering approach based on multiple features and a shot segmenting approach based on force competition. The thesis also discusses the use of audio to assist video structure mining and presents a method for segmenting news story units in news videos by identifying speakers from voice features.

(3) Research on content-based video structure syntax mining methods. The thesis presents a framework for video structure syntax mining. Based on shot segmentation, an improved frequency sensitive competitive learning (FSCL) method is put forward to cluster shots without supervision and transform the video stream into a symbol sequence. Video symbol sequences have distinctive characteristics: items are correlated in order and in time, and there is no explicit concept of a transaction. To account for these, a video association rule mining algorithm is proposed that computes support over a temporal window and improves the traditional Apriori algorithm; it mines frequent sets from the transformed cluster sequence to discover periodic or semi-periodic structure syntax patterns in videos.

(4) Research on content-based video structure semantics mining methods. The thesis presents a video structure semantics model composed of three semantic levels and two inter-level mappings to bridge the semantic gap between low-level features and high-level semantics. The model adds a shot semantic concept level between the low-level features and the high-level user demands. For the mapping from low-level features to shot semantic concepts, the thesis applies the discriminative random field (DRF) model to multi-concept shot annotation and puts forward multi-concept discriminative random field (MDRF) and generalized MDRF models to detect semantic concepts in video shots. In the proposed system framework for higher-level semantic event mining, structure syntax knowledge extracted from the video structure syntax determines the model structure, the shot-level semantic concept cues are treated as model observations, and hierarchical hidden Markov models (HHMMs) are built and trained to infer events from the cues. Through this event reasoning, the mapping from shot-level semantic concepts to higher-level video semantic events is achieved.

This thesis focuses on content-based video structure mining. It establishes the theory and framework of video structure mining and explores its methods and applications at three levels: video basic structure mining, video structure syntax mining, and video structure semantics mining. It will not only have a positive influence on multimedia data mining but also provide theoretical and practical value for other related research.
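
To illustrate the temporal-window support mentioned in item (3), the following is a minimal sketch and not the thesis's algorithm, whose details the abstract does not give. It assumes that shots have already been clustered into a symbol sequence, that a pattern counts as present when its symbols appear in order inside a sliding window, and that support is the fraction of windows containing the pattern. The function names, the window definition, and the pair-building step are all hypothetical choices made for this example.

```python
from itertools import permutations


def contains_in_order(window, pattern):
    """Return True if `pattern` occurs in `window` as an ordered subsequence."""
    it = iter(window)
    # `symbol in it` advances the iterator, so order is enforced.
    return all(symbol in it for symbol in pattern)


def window_support(sequence, pattern, width):
    """Fraction of sliding windows of `width` symbols containing `pattern` in order.

    Assumption: the window slides one symbol at a time; there are no
    explicit transactions, so each window plays the role of one.
    """
    n_windows = len(sequence) - width + 1
    if n_windows <= 0:
        return 0.0
    hits = sum(
        contains_in_order(sequence[i:i + width], pattern)
        for i in range(n_windows)
    )
    return hits / n_windows


def frequent_ordered_pairs(sequence, width, min_support):
    """Apriori-style step: extend only frequent single symbols to ordered pairs."""
    symbols = sorted(set(sequence))
    frequent = [
        s for s in symbols
        if window_support(sequence, [s], width) >= min_support
    ]
    result = {}
    for a, b in permutations(frequent, 2):
        support = window_support(sequence, [a, b], width)
        if support >= min_support:
            result[(a, b)] = support
    return result


if __name__ == "__main__":
    # Hypothetical cluster labels for nine consecutive shots.
    shots = "ABCABCABD"
    for pair, support in frequent_ordered_pairs(shots, 3, 0.5).items():
        print(pair, round(support, 3))
```

Under these assumptions, a pair such as (A, B) that recurs in the same order across many windows points to a periodic or semi-periodic shot pattern, while the reversed pair (B, A) receives a different, lower support because order is taken into account.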
Keywords/Search Tags: Multimedia Data Mining, Video Mining, Video Structure Mining, Video Basic Structure Mining, Video Structure Syntax Mining, Video Structure Semantics Mining, Video Association Rule Mining