
Research On Video Summarization Based On Semantic Content Understanding

Posted on: 2022-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2518306311461634
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, the way people exchange information has changed dramatically. Because multimedia data such as images and videos are more vivid than text, they better meet the needs of media users. However, the explosive growth of multimedia data has created a series of problems for information retrieval and storage. Video summarization uses a computer to automatically extract key frames or video segments from a long original video. The resulting summary retains the most useful information while shortening the total length of the video, which saves users time when looking for relevant content. Video summarization can support subsequent video classification, video retrieval, and efficient video storage and transmission. Because it enables quick and effective understanding of video content, it has attracted growing attention from researchers in computer vision.

Video summarization faces two major technical challenges. First, videos differ in category, content, length, and shooting conditions, so it is difficult to determine which parts of a video are important. The diversity of video content is therefore the first major challenge. Second, different users care about different aspects of the same video, so they judge its important content differently and expect different summaries. Subjective user needs and evaluations are therefore the second major challenge.

In response to these two challenges, this thesis proposes three video summarization algorithms based on semantic content understanding. The main idea is to strengthen the summarization model's ability to understand video content and to design the model around users' subjective summarization needs.

(1) An unsupervised video summarization model based on a feature pyramid network is proposed. This method treats video summarization as a sequential decision-making process. We adapt the fully convolutional network used for image semantic segmentation and design a pyramid-structured model for feature analysis. The model is trained with an unsupervised reinforcement learning strategy. Experimental results on the two general datasets SumMe and TVSum verify the effectiveness of combining the feature pyramid prediction model with the unsupervised reinforcement learning strategy.

(2) A video summarization model based on multimodal feature fusion is proposed. This method treats video summarization as a sequence-to-sequence mapping problem. The summarization framework is built from fused multimodal features and an LSTM encoder-decoder structure. Experimental results on the two general datasets SumMe and TVSum demonstrate the effectiveness of multimodal feature fusion.

(3) A query-based video summarization method with a multi-label classification network is proposed. This method treats video summarization as an object-based multi-label classification problem. The correlation between the video content and the labels is predicted by a multi-layer perceptron applied to convolutional features. The predicted probabilities are then weighted by the correlation between labels, and the part of the video most relevant to the user's query sentence is selected as the summary. Experimental results on the query-based video summarization dataset UT Egocentric demonstrate the superiority of the algorithm. In addition, a corresponding user interaction system is designed and implemented for this algorithm.
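
As a rough illustration of the selection step in method (3), the sketch below weights per-shot label probabilities by label correlation and keeps the shots most relevant to a query. It is not the thesis implementation: the function name select_summary, the scoring rule (a correlation-weighted sum), the top-k selection, and all inputs are assumptions made for the example, and the probabilities would in practice come from the trained multi-label classifier.

```python
from typing import List, Sequence


def select_summary(
    shot_probs: Sequence[Sequence[float]],
    label_corr: Sequence[Sequence[float]],
    query_labels: Sequence[int],
    k: int,
) -> List[int]:
    """Return indices of the k shots most relevant to the query, in temporal order.

    shot_probs[i][j]: assumed predicted probability that shot i contains label j.
    label_corr[a][b]: assumed correlation between labels a and b.
    query_labels:     label indices extracted from the user's query (hypothetical).
    """
    scores = []
    for probs in shot_probs:
        # Weight each predicted label probability by its correlation with
        # every queried label, then sum to one relevance score per shot.
        score = sum(
            label_corr[q][j] * p
            for q in query_labels
            for j, p in enumerate(probs)
        )
        scores.append(score)
    # Rank shots by relevance (highest first) and keep the top k.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Restore temporal order so the summary plays in sequence.
    return sorted(top)


# Toy example: three shots, two labels, query mentions label 0.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.5]]
corr = [[1.0, 0.5], [0.5, 1.0]]
print(select_summary(probs, corr, query_labels=[0], k=2))  # [0, 2]
```

Returning the selected shots in temporal order is a design choice for the example, so that the summary plays back in the same sequence as the original video.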
Keywords/Search Tags: video summarization, deep learning, feature fusion, user subjectivity, multi-label classification