
Research On Automatic Generation Of Video Barrage Comments Based On Multimodal Fusion

Posted on: 2021-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Gan
Full Text: PDF
GTID: 2518306107468874
Subject: Computer technology
Abstract/Summary:
With the rapid development of streaming media technology, online video resources are becoming increasingly abundant. Users are no longer satisfied with just watching videos; they also want to express their opinions. In most cases, however, the video and the comments are displayed separately, which makes it difficult for the user to attend to both while watching. In recent years, more and more video sites have introduced video barrage commenting. A barrage comment is a time-synchronized comment that the user writes while watching the video and that is displayed in real time, expressing the user's immediate emotions and opinions and making the video more vivid and interesting. Automatically generating video barrage comments can strengthen the interaction between the video and the user and increase both the number of users and the video click-through rate. The automatic generation of video barrage comments therefore has considerable research significance.

First, based on an analysis of the characteristics of barrage comments, this thesis proposes using the audio information, the image information, and the existing barrage comments in a video to generate barrage comments automatically. Then it proposes a multimodal fusion-based model for automatic video barrage comment generation, the Multimodal Transformer (MMTF). The model uses an encoder-decoder architecture, with an improved Transformer as the basis of both the encoder and the decoder. The encoder consists of three parts, an audio encoder, a video encoder, and a context encoder, which perform sequence analysis of the audio information, the image information, and the surrounding barrage comments in the video, respectively. The decoder fuses the encoder outputs and generates the barrage comment according to a probabilistic generation model.

Finally, a video barrage dataset, MV-Comment, is constructed to address the scarcity of datasets currently available for the automatic generation of video barrage comments. The dataset is large, of high quality, and broad in coverage. It is also suitable for various deep learning tasks related to videos and barrage comments, such as automatic generation of video barrage comments, analysis and detection of highlights in videos, and sentiment analysis of barrage comments.

The experiments use the MV-Comment dataset to answer three questions: whether the MMTF model performs better than existing methods, whether audio information affects the model, and whether filtering useless barrage comments is effective. First, the model is trained using only the image information and the barrage comments in the video and compared with state-of-the-art methods that use the same two kinds of information; the results show that the MMTF model performs better. Second, the MMTF model is compared with and without audio information; the results show that using audio information improves the model's performance. Third, experiments on the original dataset and on the dataset with useless barrage comments filtered out show that filtering effectively improves the quality of the barrage comments the model generates.
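
The abstract names three encoder inputs (audio, images, and surrounding barrage comments) but does not say how the surrounding comments are selected. The sketch below illustrates one plausible way to assemble them, assuming a fixed time window around the target comment. The names `Barrage`, `surrounding_comments`, and `build_sample`, and the window size, are hypothetical and not taken from the thesis.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Barrage:
    time: float  # seconds from the start of the video
    text: str


def surrounding_comments(comments: List[Barrage], index: int,
                         window: float = 5.0) -> List[Barrage]:
    """Return comments within `window` seconds of comments[index], excluding it.

    `comments` must be sorted by time. The fixed window is an assumption;
    the thesis may select context differently.
    """
    times = [c.time for c in comments]
    t = times[index]
    lo = bisect_left(times, t - window)   # first comment inside the window
    hi = bisect_right(times, t + window)  # one past the last comment inside
    return [c for i, c in enumerate(comments[lo:hi], start=lo) if i != index]


def build_sample(comments: List[Barrage], index: int,
                 window: float = 5.0) -> Dict[str, object]:
    """Bundle the inputs for one target comment (hypothetical layout).

    `time_span` is the interval from which frames and audio would be
    cropped for the video and audio encoders; `context` feeds the
    context encoder; `target` is the comment the decoder should produce.
    """
    target = comments[index]
    return {
        "time_span": (max(0.0, target.time - window), target.time + window),
        "context": [c.text for c in surrounding_comments(comments, index, window)],
        "target": target.text,
    }
```

Calling `build_sample(comments, i, window=3.0)` on a time-sorted list yields the neighbouring comment texts together with the time span to crop from the video and audio tracks; a larger window gives the context encoder more text at the cost of less temporally specific context.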
Keywords/Search Tags: Multimodal Analysis, Barrage Comments Generation, Transformer Model