
Information Guided Video Summarization

Posted on: 2020-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Zhao
Full Text: PDF
GTID: 2518306518467194
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the growing volume of online video data, retrieving, browsing, querying, and managing video has become time-consuming and exhausting for users. Video summarization, an important technique for condensing video content, aims to improve the efficiency of browsing, querying, and retrieval and to enhance the user experience. From the perspective of information guidance, this dissertation studies two forms of the task: single-video summarization and multi-video summarization. First, existing single-video summarization methods based on sequence models cannot fully capture the discriminative information within a sequence, and the summaries they produce do not match user summaries closely in terms of distribution consistency. To address this, we propose a distribution-consistent video summarization algorithm guided by attention information. A deep attention model, composed of a self-attention encoder and an attention-based decoder, enhances the visually discriminative inter-frame information both within and across sequences, and a loss function based on a distribution consistency strategy is introduced to optimize summary quality. Second, existing multi-video summarization algorithms based on traditional machine learning and other approaches do not exploit the latent diverse information in the data. To address this deficiency, we propose an autoencoder-based multi-video summarization algorithm guided by cross-modal fusion information. For multiple videos, we construct an information guidance model on a fused visual-text modality: the fused-modality information guides the visual features, the guided features then serve as an importance constraint for training a sparse autoencoder, and finally the key frames of the multi-video summary are extracted. Extensive comparative experiments and performance analyses are conducted on the single-video datasets SumMe and TVSum and on the multi-video dataset MVS1K. Compared with several mainstream summarization algorithms, the evaluation results demonstrate the feasibility and effectiveness of the two proposed algorithms.
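
The abstract names two ingredients of the single-video method, self-attention over frames and a distribution consistency loss, without giving their exact form. The following is a minimal sketch of how such pieces might fit together, not the thesis's actual model. Everything specific here is an assumption made for illustration: the identity query/key/value projections, the "sum of attended features" scoring rule, the use of KL divergence as the consistency measure, and all function names.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(frames):
    """Scaled dot-product self-attention over frame feature vectors.

    Assumption: queries, keys, and values are the raw features
    (identity projections); a real model would learn these projections.
    """
    d = len(frames[0])
    attended = []
    for q in frames:
        # Similarity of this frame to every frame, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in frames]
        weights = softmax(scores)
        # Weighted average of all frame features.
        attended.append([sum(w * v[j] for w, v in zip(weights, frames))
                         for j in range(d)])
    return attended

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), used here as a hypothetical distribution consistency loss."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Toy example: three frames with two-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(frames)

# Hypothetical scoring rule: softmax over the sum of each attended vector.
predicted = softmax([sum(v) for v in attended])

# Hypothetical user importance annotations, normalized to a distribution.
raw_user = [1.0, 1.0, 3.0]
user = [s / sum(raw_user) for s in raw_user]

loss = kl_divergence(user, predicted)
```

In this sketch the loss falls to zero only when the predicted frame-importance distribution matches the user distribution, which is one plausible reading of "distribution consistency".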
Keywords/Search Tags: Video summarization, Attention mechanism, Cross-modal fusion, Information guidance