
Research and Application of an Unsupervised Video Summarization Method Based on Subtitle Semantics

Posted on: 2024-01-05 | Degree: Master | Type: Thesis
Country: China | Candidate: H R Sun | Full Text: PDF
GTID: 2568307130953139 | Subject: Computer technology
Abstract/Summary:
With the rise of the multimedia industry and the rapid growth of data on video platforms, the demand for video summarization technology is becoming increasingly clear. Video summarization generates summaries that make video browsing more efficient, and it is widely used in scenarios such as video auditing and data annotation. Traditional video summarization methods mainly select frames based on image features. If only image information is considered, however, a large amount of available text information is lost. The main problem addressed in this thesis is therefore how to extract the key semantic information of a video's subtitles efficiently and accurately, and how to use it to remedy the shortcomings of existing video summarization technology. The goal is an unsupervised, semantics-based video summarization method that combines the semantic information of subtitles with unsupervised learning strategies. Specifically, a reinforcement learning algorithm based on semantic rewards is first proposed to generate timeline summaries, solving the problem of semantic loss in scenes with dense subtitles. Next, a cross-modal self-supervised video summarization algorithm based on timeline clustering is proposed to address the scarcity of text data in cross-modal video summarization. Finally, an intelligent video summarization system is designed and implemented on the basis of these algorithms. The main work of this thesis is summarized as follows:

(1) A reinforcement learning video summarization algorithm based on semantic rewards is proposed for scenes with dense subtitles. Two reward mechanisms are designed to evaluate the quality of the subtitle timeline. The first reward is the semantic quality of the timeline text: it evaluates the structural integrity, structural richness, and thematic prominence of the text using abstract semantic representation and the term frequency-inverse document frequency (TF-IDF) algorithm. The second reward is the text diversity of the timeline: abstract semantic representation is used to expand the timeline and complete its semantic information, and the content diversity of the summary is then evaluated. In addition, a strongly diversity-oriented image summarization algorithm is proposed as an intermediate model that filters out redundant frames among those mapped from the timeline, yielding the final comprehensive summary. Experiments on a general-purpose dataset and on an independently constructed dense-subtitle dataset show that the method has significant advantages in dense-subtitle scenes.

(2) Unlike the limited metadata attached to a video, the subtitle timeline is fine-grained text that has advantages in both quantity and quality and is closely related to the image frames. Exploiting this property, the thesis proposes a self-supervised video summarization method based on cross-modal features. First, the text feature sequence of the timeline clusters is matched with the corresponding frame feature sequence to form association-strength labels between the two modalities, from which a cross-modal association-strength evaluation model is obtained. Second, by predicting the temporal order relationship between features of different modalities, a coherence evaluation model covering both timeline clusters and frame sequences is obtained. Finally, the overall loss function of the summarizer is built from these two pre-training tasks, and the final optimization objective is determined. Experiments on a general-purpose dataset confirm that the summaries produced by this method are closer to the ground-truth summaries than those of the baseline model.

(3) Based on the above research, an intelligent video summarization system is designed and implemented. The system implements the core functions of a summarization system, automatically selects an appropriate summarization method for each scenario, and demonstrates the practical value of the summarization models proposed in this thesis.
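
The abstract does not give formulas for its rewards, so the following is only a minimal sketch of how a TF-IDF-based text diversity reward over selected subtitle segments could look. Everything in it is an assumption made for illustration: the function names (`tfidf_vectors`, `cosine`, `diversity_reward`), the choice of mean pairwise cosine dissimilarity as the diversity measure, and the use of pre-tokenized subtitle segments as input. It omits the abstract semantic representation step and is not the thesis's actual reward.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized segment."""
    n = len(docs)
    # Document frequency: number of segments in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def diversity_reward(vectors, selected):
    """Hypothetical reward: mean pairwise dissimilarity (1 - cosine) of selected segments."""
    pairs = [(i, j) for a, i in enumerate(selected) for j in selected[a + 1:]]
    if not pairs:
        return 0.0  # fewer than two segments: no diversity to measure
    return sum(1.0 - cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


# Usage: three toy subtitle segments, already tokenized (assumed preprocessing).
subtitles = [
    ["cat", "sat", "mat"],
    ["cat", "sat", "mat"],
    ["stock", "market", "fell"],
]
vectors = tfidf_vectors(subtitles)
redundant = diversity_reward(vectors, [0, 1])  # two identical segments
varied = diversity_reward(vectors, [0, 2])     # two unrelated segments
```

In this sketch a selection of near-duplicate subtitle segments scores close to 0 and a selection of unrelated segments scores close to 1, which is the kind of signal a reinforcement learning agent could be rewarded on when choosing timeline segments.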
Keywords/Search Tags: Video summarization, Self-supervised learning, Reinforcement learning, Unsupervised method