Font Size: a A A

Research On Video Summarization Method Based On Deep Learning

Posted on:2024-05-18Degree:MasterType:Thesis
Country:ChinaCandidate:X WangFull Text:PDF
GTID:2568307151467084Subject:Communication Engineering (including broadband network, mobile communication, etc.) (Professional Degree)
Abstract/Summary:PDF Full Text Request
With the widespread application of computer networks and the rapid development of multimedia technology,humanity has entered the era of big data,and online video data is growing rapidly.Video summarization task is a research problem based on computer vision and natural language processing,that is,how to summarize the main content of the original video with shorter and longer video frames.At present,many research methods for video summarization have been proposed,but how to maximize the fusion of contextual information,achieve information exchange,and make the generation of summaries more representative has become a challenge to be solved.This article proposes three research models,which use deep convolutional networks for frame feature extraction and combine them to generate video abstracts.Firstly,this article proposes a video summarization algorithm based on gated convolution,which utilizes a fully convolutional network to generate summaries.This method first utilizes a deep convolutional network for video feature extraction,and then uses a fully convolutional sequence network to fuse multi-scale contextual feature information,expressing the video abstract as a sequence labeling problem.In the fully convolutional sequence network,this paper introduces gated convolution,which effectively reduces gradient dispersion and can better capture remote dependencies.Finally,through experimental verification,the proposed method achieved good results on the TVSum dataset.Secondly,this article proposes an unsupervised video summarization method based on Moglstm.To address the issue of sparse datasets in the field of video summarization,this method utilizes an unsupervised processing approach to generate higher quality abstracts by guiding the diversity and representativeness of generated abstracts through reinforcement learning.In addition,in order to realize the modeling of richer interaction space between input information and context and reduce the loss of context information,this paper selects Moglstm to process temporal features.The feasibility of this method in the field of video summarization has been demonstrated through experiments on two datasets,TVSum and Sum Me.Finally,this article proposes a video summarization method based on multi branch networks.In order to increase the amount of information and achieve prediction of longdistance information,this paper adopts deep convolutional network fusion attention mechanism for video frame feature processing,and constructs a multi-branch network for temporal information modeling,which can simultaneously focus on long and short video information and predict its importance score.In response to the issue of insufficient comprehensive standards when evaluating the generation of abstracts,this article adds evaluation indicators to further evaluate the richness of the generated abstracts.
Keywords/Search Tags:video summarization, gated convolution, Moglstm, multi-branch network, new evaluation metrics
PDF Full Text Request
Related items