
Research On Relevant Techniques Of Web Video Based Thumbnail Recommendation And Topic Detection

Posted on: 2016-01-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W G Zhang
GTID: 1108330503469573
Subject: Computer application technology
Abstract/Summary:
With the rapid development of internet technologies and the prevalence of mobile video recording devices, network platforms such as video sharing websites and news websites have become important channels for accessing and spreading information. A mass of web videos coexists with other multimodal information, such as text and images, on these platforms. These data are closely tied to people’s daily lives and reflect hot topics and important events in the real world, and have therefore received great attention. How to analyze, understand and represent such cross-media data effectively and intelligently has become an important research issue in multimedia technology. Working with web videos and their related multimodal information, this thesis focuses on methods for intelligently understanding and representing the key information in cross-media data, and proposes solutions to several concrete problems: video shot boundary detection, video keyframe extraction, the generation and recommendation of web video thumbnails, and multimodal topic detection.
For the low-level structure analysis of web videos, the thesis addresses the challenge of locating dissolves, which account for a large proportion of gradual shot transitions, and presents an image quality based coarse-to-fine method for video dissolve detection. Dissolves are difficult to detect because the dissolve model fluctuates and the dissolve length varies. Based on the observation that a dissolve transition shows a "clarity-blur-clarity" visual pattern, with a corresponding "high-low-high" groove in the image quality of the video frames, the proposed method detects dissolves in four steps: image quality evaluation, coarse detection to obtain candidate dissolves, dissolve length normalization, and SVM based fine detection. Experimental results on standard datasets validate its feasibility and effectiveness.
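The coarse stage of the dissolve detector described above can be illustrated with a minimal sketch. It assumes that a per-frame image quality score has already been computed by some sharpness measure; the abstract does not say which measure the thesis uses. The function names `find_groove_candidates` and `normalize_length`, the thresholds `min_drop` and `max_half_width`, and the use of linear interpolation for length normalization are all assumptions made for illustration, not the thesis's actual procedure. The SVM based fine detection step is omitted.

```python
def find_groove_candidates(quality, min_drop=0.2, max_half_width=15):
    """Find 'high-low-high' grooves in a per-frame quality curve.

    Returns a list of (start, bottom, end) frame index triples.
    A groove bottom is a strict local minimum; flat bottoms are not handled.
    """
    candidates = []
    n = len(quality)
    for i in range(1, n - 1):
        # Keep only strict local minima as possible groove bottoms.
        if not (quality[i] < quality[i - 1] and quality[i] < quality[i + 1]):
            continue
        # Walk left while the quality keeps rising away from the bottom.
        left = i
        while (left > 0 and i - left < max_half_width
               and quality[left - 1] > quality[left]):
            left -= 1
        # Walk right in the same way.
        right = i
        while (right < n - 1 and right - i < max_half_width
               and quality[right + 1] > quality[right]):
            right += 1
        # Relative depth of the groove, measured against the lower shoulder.
        shoulder = min(quality[left], quality[right])
        if shoulder > 0 and (shoulder - quality[i]) / shoulder >= min_drop:
            candidates.append((left, i, right))
    return candidates


def normalize_length(values, target_len=8):
    """Resample a candidate's quality curve to a fixed length.

    Linear interpolation is an assumed choice; it gives every candidate
    the same feature dimension before a classifier is applied.
    """
    if len(values) == 1 or target_len == 1:
        return [values[0]] * target_len
    scale = (len(values) - 1) / (target_len - 1)
    out = []
    for k in range(target_len):
        pos = k * scale
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


# Toy quality curve (made-up values): sharp, blurred in the middle, sharp again.
quality = [1.0, 1.0, 1.0, 0.8, 0.5, 0.8, 1.0, 1.0, 1.0]
for start, bottom, end in find_groove_candidates(quality):
    feature = normalize_length(quality[start:end + 1], target_len=8)
    print(start, bottom, end, feature)
```

Each fixed-length `feature` vector could then be passed to a trained classifier for the fine decision, which is where the SVM step named in the abstract would fit.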
In addition, because it is difficult to determine the number of keyframes dynamically and adaptively during keyframe extraction, an unsupervised clustering method is proposed for extracting video keyframes. It needs no prior parameters and therefore avoids the difficulties of traditional strategies such as threshold selection and cluster number determination.
For the mid-level compact content representation of web videos, this thesis proposes a novel web video thumbnail generation method based on visual content perception. Existing methods in the literature share several problems: the generated thumbnails are blurred, the salient objects in them are too small to be clearly perceived, or the thumbnails are only weakly related to the video topic. To address these problems, the proposed approach considers three key elements in evaluating thumbnail generation: image quality, visual accessibility and video content representativeness. It generates clearer, more intuitive and more content-related thumbnails through image quality assessment of video frames, visual saliency based accessibility analysis, and similarity based evaluation of content representativeness. Experimental results and user study based comparisons on a set of web videos indicate that the proposed method generates video thumbnails effectively and automatically, and that the thumbnails are of high quality and meet the requirements of practical applications.
In web video thumbnail recommendation, high quality is not the only requirement. Personalized recommendation is also extremely important, because the users who upload videos (video owners) and the users who watch them (video browsers) perceive videos differently and therefore have different recommendation needs.
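The three criteria named in the thumbnail generation paragraph above (image quality, visual accessibility, content representativeness) can be combined as in the following minimal sketch. The abstract does not state how the thesis fuses the three scores, so the weighted sum, the equal default weights, and the use of mean cosine similarity as the representativeness measure are assumptions. The quality and accessibility scores are taken as given inputs in the range 0 to 1, and the names `representativeness` and `select_thumbnail` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def representativeness(features):
    """Score each frame by its mean similarity to all other frames."""
    n = len(features)
    if n < 2:
        return [1.0] * n
    return [
        sum(cosine(features[i], features[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def select_thumbnail(quality, accessibility, features, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return (best_index, scores) using an assumed weighted-sum fusion."""
    w_q, w_a, w_r = weights
    rep = representativeness(features)
    scores = [
        w_q * q + w_a * a + w_r * r
        for q, a, r in zip(quality, accessibility, rep)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Three candidate frames with made-up scores and 2-D feature vectors.
quality = [0.9, 0.8, 0.9]
accessibility = [0.5, 0.9, 0.5]
features = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
best, scores = select_thumbnail(quality, accessibility, features)
print(best, scores)
```

In this toy input the second frame is chosen: it is slightly less sharp than the others, but its salient content is easier to perceive and it resembles the rest of the video more closely.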
To achieve personalized recommendation, and inspired by the visual content perception based thumbnail generation method above, this thesis proposes a unified web video thumbnail recommendation framework that fuses visual content analysis with user query matching. The framework automatically and adaptively recommends thumbnails for both video owners and video browsers by exploiting image quality assessment, SVR based image accessibility analysis, mutual reinforcement based calculation of video content representativeness, and query-sensitive matching against users’ search intent. Subjective evaluation results demonstrate that the proposed approach not only recommends clear and representative thumbnails dynamically and effectively, but also reduces the preference difference between video owners and video browsers, thus improving their viewing experience.
For the detection of high-level semantic web topics in cross-media data, including web videos and news report documents, this thesis proposes a multimodal web topic detection approach that combines visual and textual information. Web video data suffer from data disparity and missing modalities, and web topics are inherently multi-granular, sparse and lacking in guiding information. The proposed method therefore takes full advantage of multimodal information, including web videos with their surrounding text (titles, tags) and news report documents (titles, news images). It obtains the final topic detection results through coarse detection based on weighted dense keyword groups, textual linking, visual linking, refinement of the extracted coarse keyword groups, and clustering of the documents related to each keyword group. Experiments are conducted on two cross-media datasets, CM-NV and MCG-WEBV.
The results demonstrate that the proposed method is effective for cross-media web topic detection and lets multimodal information complement and reinforce each other, making topic detection more accurate and comprehensive.
Once the web video thumbnails and the document set related to each web topic (including web videos and news report documents) have been obtained, they can be used to present web topics visually and to help users grasp the relevant events easily, in a more vivid and complete way.
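The idea behind coarse topic detection with keyword groups, followed by clustering of the related documents, can be sketched with a much simpler stand-in technique. The sketch below is not the thesis's weighted dense keyword group algorithm. It builds a keyword co-occurrence graph from document tags, keeps edges whose co-occurrence count reaches an assumed threshold `min_weight`, and treats each connected component as one keyword group. Textual linking, visual linking and group refinement are omitted, and the function names and toy tags are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def keyword_groups(docs, min_weight=2):
    """Group keywords that co-occur in at least `min_weight` documents.

    `docs` is a list of keyword sets (for example, video tags or title words).
    Connected components of the thresholded co-occurrence graph stand in
    for dense keyword groups in this simplified sketch.
    """
    weight = defaultdict(int)
    for keywords in docs:
        for a, b in combinations(sorted(set(keywords)), 2):
            weight[(a, b)] += 1

    # Keep only sufficiently frequent co-occurrences as graph edges.
    adjacency = defaultdict(set)
    for (a, b), w in weight.items():
        if w >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Depth-first search to collect connected components.
    groups, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups


def assign_documents(docs, groups):
    """Attach each document to the keyword group it overlaps most."""
    clusters = defaultdict(list)
    for index, keywords in enumerate(docs):
        overlaps = [len(set(keywords) & group) for group in groups]
        if overlaps and max(overlaps) > 0:
            clusters[overlaps.index(max(overlaps))].append(index)
    return dict(clusters)


# Toy tag sets (made-up data) for five documents.
docs = [
    {"flood", "rescue", "river"},
    {"flood", "rescue"},
    {"olympics", "gold", "medal"},
    {"olympics", "gold"},
    {"weather"},
]
groups = keyword_groups(docs)
print(groups)
print(assign_documents(docs, groups))
```

Documents that match no keyword group, such as the last one in the toy data, are left unassigned. A fuller pipeline would presumably use the visual and textual linking steps named in the abstract to connect such items to a topic.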
Keywords/Search Tags: video shot, keyframe, web video thumbnail, weighted dense keyword group, topic detection