Content Analysis And Understanding Of Video Commercial

Posted on: 2013-02-02  Degree: Doctor  Type: Dissertation
Country: China  Candidate: N Liu
GTID: 1118330371959343  Subject: Signal and Information Processing
Abstract/Summary:
As one of the most popular means of promoting products, video commercials have become an inescapable part of modern life, significantly influencing our work habits and other aspects of life. Because of their importance, tens of thousands of commercials are produced and broadcast on many TV channels to promote a variety of new commodities and services, costing billions of dollars. Meanwhile, thanks to the rapid development of digital technologies, people can conveniently record more and more commercials to acquire commercial information. However, the explosive growth of recorded commercials creates a pressing demand, across different user groups, for a smart commercial content analysis and understanding (CCAU) scheme that supports practical applications such as commercial filtering, capturing, and indexing. An effective CCAU scheme is therefore highly desirable to assist these users in monitoring, browsing, and indexing daily updated commercials. This line of research has become an intense focus in multimedia analysis.

To address the challenges of CCAU, this dissertation explores several of its key issues using state-of-the-art computer vision, machine learning, and multimedia processing techniques. Specifically, we propose a variety of mid-level descriptors that describe the intrinsic semantics of commercials from different modalities. In addition, to exploit these cross-media characteristics collaboratively, several techniques are designed to boost the performance of the proposed CCAU methods. The main contributions of this dissertation are as follows:

1) Video commercial recognition using a coarse-to-fine matching strategy
To improve the efficiency of video commercial recognition, a coarse-to-fine matching strategy is proposed that combines locality-sensitive hashing (LSH) with fine granularity successive elimination (FGSE). Specifically, LSH accelerates the initial coarse retrieval, and FGSE is adapted to rapidly eliminate irrelevant candidates that have passed the coarse matching stage.

2) Enhanced co-training based video commercial text detection
To pave the way for using the textual characteristics of commercials, we present an enhanced co-training based commercial text detection approach that interactively exploits the intrinsic correlation among multiple texture representation spaces. Specifically, to alleviate the problem of noisy samples in the co-training process, an enhanced co-training strategy combined with Bootstrap is proposed to improve the generalization ability of the classifier.

3) Collaboratively exploiting visual-audio-textual characteristics for video commercial block detection
We detect commercial blocks by collaboratively exploiting the visual, audio, and textual characteristics embedded in commercials. Rather than relying exclusively on visual-audio characteristics, as most previous work does, we also exploit intrinsic textual characteristics that are associated with commercials but rarely present in general programs, by analyzing the spatio-temporal properties of overlay text in commercials. Additionally, Tri-AdaBoost, an interactive ensemble learning method, is proposed to form a consolidated semantic fusion across visual, audio, and textual characteristics.

4) Video commercial block segmentation based on the collaborative fusion of visual-audio-textual descriptors
An effective commercial block segmentation method is proposed that collaboratively fuses visual, audio, and textual descriptors. Additional informative descriptors, including textual characteristics, are introduced to make the detection of frames marked with product information (FMPI) more robust. Together with the audio characteristics, FMPI provides a complementary representation for modeling intra-commercial similarity and inter-commercial dissimilarity. In addition, the temporal relations among these multi-modal descriptors are used collaboratively to segment a commercial block into multiple individual commercials.

5) Video commercial categorization using a sparse coding based visual bag of words representation
To boost the discriminative ability of the traditional visual bag of words (VBoW) in commercial categorization, a more suitable representation, sparse coding based VBoW, is presented to describe the co-occurrence of semantic units in different kinds of commercials. These semantic units are mapped into an over-complete dictionary, and each commercial is then represented by a sparse linear combination of the atoms in that dictionary.
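
As a rough illustration of the coarse-to-fine matching idea in contribution 1, the following Python sketch pairs a coarse LSH bucket lookup with a successive-elimination fine stage. It is not the dissertation's implementation, which the abstract does not detail. The sketch makes several assumptions: commercials are represented as fixed-length integer feature vectors, the LSH family is sign random projection, the fine distance is the sum of absolute differences (SAD), and candidates are rejected using partial-sum lower bounds on SAD at increasingly fine partitions. All function names and parameters are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed LSH family: sign random projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vec, planes):
    """Hash a vector to a tuple of sign bits, one bit per hyperplane."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in planes)


def build_index(database, planes):
    """Coarse stage, offline part: group database vectors by hash key."""
    index = {}
    for i, vec in enumerate(database):
        index.setdefault(lsh_key(vec, planes), []).append(i)
    return index


def partial_sums(vec, n_parts):
    """Sum the vector over n_parts equal-sized contiguous segments."""
    size = len(vec) // n_parts
    return [sum(vec[k * size:(k + 1) * size]) for k in range(n_parts)]


def sad(a, b):
    """Sum of absolute differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sad_lower_bound(a, b, n_parts):
    """Lower bound on SAD from segment sums (triangle inequality).

    Finer partitions (larger n_parts) give tighter bounds at higher cost.
    """
    sums_a, sums_b = partial_sums(a, n_parts), partial_sums(b, n_parts)
    return sum(abs(x - y) for x, y in zip(sums_a, sums_b))


def fine_match(query, database, candidates, levels=(1, 2, 4)):
    """Fine stage: skip a candidate as soon as any bound rules it out."""
    best_id, best_dist = None, float("inf")
    for i in candidates:
        cand = database[i]
        # Cheap bounds first; the full SAD is computed only for survivors.
        if any(sad_lower_bound(query, cand, n) >= best_dist for n in levels):
            continue
        dist = sad(query, cand)
        if dist < best_dist:
            best_id, best_dist = i, dist
    return best_id, best_dist


def search(query, database, index, planes):
    """Coarse LSH lookup followed by fine successive elimination."""
    candidates = index.get(lsh_key(query, planes), [])
    return fine_match(query, database, candidates)


if __name__ == "__main__":
    db = [
        [1, 2, 3, 4, 5, 6, 7, 8],
        [8, 7, 6, 5, 4, 3, 2, 1],
        [0, 5, 0, 5, 0, 5, 0, 5],
        [9, 9, 9, 9, 0, 0, 0, 0],
    ]
    planes = make_hyperplanes(dim=8, n_bits=4)
    index = build_index(db, planes)
    print(search(list(db[2]), db, index, planes))
```

In this sketch a single hash table is used, so a near-duplicate query that falls into a different bucket would be missed; practical LSH schemes usually query several tables to reduce that risk. Because each partial-sum bound never exceeds the true SAD, the fine stage returns the same best match within the candidate set as an exhaustive SAD comparison would.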
Keywords/Search Tags: video analysis, video semantics understanding, video commercial recognition, video commercial block detection, video commercial semantic categorization, video commercial text detection