Font Size: a A A

Video Summarization Based On Deep Learning Networks

Posted on:2019-10-24Degree:MasterType:Thesis
Country:ChinaCandidate:J X WuFull Text:PDF
GTID:2428330566461589Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
As video data grows explosively in recent years,browsing such a huge quantity of videos is time-consuming and tedious.As a matter of fact,video summarization is an ideal tool for people to watch the video in a rapid way,which provides a compact form of the input video.Generally,video summarization can be divided into two categories: static video summarization and dynamic video summarization.Static video summarization provides a static storyboard comprising of representative individual frames while dynamic video summarization is a kind of video skimming,which is consist of attracted and brief video shots.Video summarization,which appears as static or dynamic form,is able to help users to browse video content rapidly.In this dissertation,two methods are proposed to deal with video summarization task.Firstly,a novel clustering method is proposed for static video summarization.I propose an effective clustering method based on a high density peaks search clustering algorithm as well as by integrating some essential elements of the video.The proposed method gathers similar frames into clusters and finally static video summary is comprised of all clusters' center.The proposed method is evaluated on two static video summarization benchmark datasets.The experimental results show that the proposed method is able to generate a better summary when compared with other clusteringbased static video summarization methods.It is also proved that the proposed clustering algorithm is more stable and effective for static video summarization than several classical clustering methods.In addition,in this dissertation,I propose a novel foveated two-stream convolutional neural networks for dynamic video summarization.Foveated images are constructed based on subjects' eye movements to represent the spatial information of the input video.Multi-frame motion vectors are stacked across several adjacent frames to convey the motion clues.We are the first to integrate gaze information into a deep learning network for video summarization.To evaluate the proposed method,experiments are conducted on two video summarization benchmark datasets.The experimental results validate the effectiveness of the gaze information for video summarization despite the fact that the eye movements are collected from different subjects from those who generated summaries.Empirical validations also demonstrate that our proposed foveated convolutional neural networks for video summarization can achieve state-of-the-art performances on these benchmark datasets.
Keywords/Search Tags:Video Summarization, Deep Learning Networks, Clustering, Convolutional Neural Networks, Eye Movement Information
PDF Full Text Request
Related items