
Video Captioning And Video Segment Popularity Prediction Based On Short Video

Posted on: 2020-09-15
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhou
Full Text: PDF
GTID: 2428330596975068
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of digital devices and the continuous upgrading of online media, more and more people are willing to post videos online to share their daily lives. As the amount of short video data grows, people do not have enough time to watch every video one by one, so there is an urgent need for an effective way for machines to automatically analyze the information in a video and to summarize and organize its content, so that audiences can enjoy these videos more easily.

Visual content description has been attracting broad research attention in the multimedia community. In the visual captioning task, a piece of visual content is given to the machine, and the model automatically generates a sentence describing it. In this work, we propose a novel adaptive attention strategy for visual captioning, which can selectively attend to salient visual content based on linguistic knowledge. Specifically, we design a key control unit, termed the visual gate, to adaptively decide “when” and “what” the language generator attends to during the word generation process. We evaluate the proposed approach on the commonly used MSVD benchmark. The experimental results demonstrate the superiority of the proposed approach over several state-of-the-art methods.

Generally, not all segments in a video are attractive; some may be boring. If we can predict which segment of a newly generated video stream will be popular, audiences can watch just that segment rather than the whole video to find the funny point. And if we can predict the emotions a video will induce in its audience, this will help video analysis and guide video makers in improving their videos. In recent years, crowdsourced time-sync video comments have emerged worldwide, supporting further research on temporal video labeling. In this work, we propose a novel framework to achieve the following goals: (1) predicting which segment of a newly generated video stream (one that has not yet received any time-sync comments) will be popular among audiences; (2) predicting which emotion a newly released video will induce in its audience. Finally, experimental results on real-world data demonstrate the effectiveness of the proposed framework and support the idea of using crowdsourced time-sync comments as a bridge for video analysis when predicting the popularity of segments in a video.
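The abstract does not give the equations of the visual gate, so the following is only a minimal sketch of one common way such a gate could work, not the thesis's actual formulation. It assumes dot-product attention over per-frame features (the “what”) and a scalar sigmoid gate computed from the decoder state (the “when”) that mixes the attended visual vector with the linguistic state. The names `gated_context`, `frame_feats`, `hidden`, and `gate_weights` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_context(frame_feats, hidden, gate_weights):
    """Return (context, gate, weights) for one decoding step.

    frame_feats  : list of per-frame feature vectors (assumed input)
    hidden       : decoder hidden state, same dimension as each frame
    gate_weights : hypothetical learned vector that parameterizes the gate
    """
    # "What": attention weights over frames from dot-product scores.
    weights = softmax([dot(frame, hidden) for frame in frame_feats])
    dim = len(hidden)
    visual = [
        sum(w * frame[i] for w, frame in zip(weights, frame_feats))
        for i in range(dim)
    ]
    # "When": a scalar in (0, 1); values near 1 rely on visual content,
    # values near 0 rely on the linguistic (decoder) state instead.
    gate = sigmoid(dot(gate_weights, hidden))
    context = [gate * v + (1.0 - gate) * h for v, h in zip(visual, hidden)]
    return context, gate, weights
```

Under these assumptions, a word such as "the" or "of" would ideally receive a gate value near 0, since it can be predicted from the sentence so far, while a content word such as an object or action would receive a value near 1.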
Keywords/Search Tags: Deep Learning, Short Video, Time-Sync Comments