Micro-video Representation Learning Based On Complex Relation Modeling

Posted on: 2021-03-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y W Wei
GTID: 1368330602982460
Subject: Computer Science and Technology
Abstract/Summary:
Micro-video is an emerging form of social media characterized by short duration, easy operation, and convenient sharing, which suits modern social networks. As such, the micro-video industry has developed rapidly over the past ten years: after a budding period, its growth has now entered an explosive phase. However, with the growth of users, micro-videos, and markets, managing and organizing these massive numbers of micro-videos has become a huge challenge. In this thesis, I aim to leverage machine learning techniques to optimize micro-video representation learning, which can greatly improve the management and organization of micro-videos in an automatic and intelligent manner. Besides, the results can be extended to other relevant multimedia computing domains, such as traditional video understanding, multimedia recommendation, and social network analysis, and they contribute to solving certain research issues in these domains.

Although researchers have conducted many studies on micro-video representation learning and obtained certain achievements, these studies are limited to exploring the content information and ignore the complex relations involved in the learning. Therefore, considering the characteristics of micro-videos, I explore and model the complex relations existing in micro-video understanding and analysis to optimize the representation learning. By jointly investigating several applications, I thoroughly explore the complex relations in representation learning. From the perspective of the micro-video, the relations can be divided into three groups: 1) the intra-relation among the different modalities in one micro-video, 2) the inter-relation among different micro-videos, and 3) the outer-relation between a micro-video and external social information. On top of these relations, I further study the hybrid-relation among them, which is common in real-world scenarios. In summary, this thesis focuses on micro-video representation learning based on complex relation modeling and evaluates the effectiveness of the results by deploying them to micro-video understanding and analysis in practice. The contributions of this thesis are listed as follows:

(1) Intra-relation modeling based micro-video representation learning. I distinguish and define the consistent and complementary relations by exploring the correlation among modalities. Specifically, a relation-aware multi-modal neural cooperative learning model is proposed to explicitly disentangle and model these two relations. It is the first attempt to design a multi-modal fusion strategy according to the consistent and complementary relations. This strategy is verified to facilitate multi-modal information representation learning and improve the quality of micro-video representation learning.

(2) Inter-relation modeling based micro-video representation learning. Through in-depth analysis of the incomplete and insufficient representation learning in micro-video personalized recommendation, I propose to discover the user intents hidden in co-interacted micro-video pairs. To this end, a hierarchical user intent graph convolutional network is proposed on the co-interacted micro-video graph. By iteratively performing graph aggregation and graph clustering operations, the user intents and their multi-level structure are modeled. Furthermore, the user and micro-video representations can be optimized with the obtained user intents.

(3) Outer-relation modeling based micro-video representation learning. Exploring the complex relations in micro-video personalized hashtag recommendation, I propose to discover the correlation among user preference, micro-video content, and hashtag semantics. Combined with the hashtag usage pattern, the relation between micro-video and social information is explicitly modeled to promote micro-video representation learning. A novel graph convolutional network based method for personalized micro-video hashtag recommendation is then presented, which extends the graph convolution operations to the tripartite graph. Moreover, by introducing the attention mechanism, it adaptively propagates information between nodes in the graph for micro-video representation learning.

(4) Hybrid-relation modeling based micro-video representation learning. Pursuing high-quality recommendation, I simultaneously model the user-item relation and the relations among multiple modalities. Based on the user-micro-video bipartite graph constructed in each modality, multi-modal graph convolutional networks are proposed to explicitly model the modal-specific representation for each user and micro-video. Here, I use the gradient backpropagation of shared parameters to accomplish the information transfer between the multiple bipartite graphs.
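As a rough illustration of the idea in contribution (4), the sketch below runs one round of neighbor aggregation on a user-micro-video bipartite graph separately for each modality and then combines the modal-specific vectors. It is a minimal sketch under stated assumptions, not the model from the thesis: the mean aggregator, the element-wise-sum fusion, the function names (`mean`, `propagate`, `fuse`), and the toy data are all hypothetical, and the learnable shared parameters trained by backpropagation that the abstract mentions are omitted.

```python
from collections import defaultdict


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def propagate(interactions, user_vecs, item_vecs):
    """One round of mean-neighbor aggregation on a bipartite graph.

    Each node's new vector is the average of its own vector and the mean
    of its neighbors' vectors; nodes without neighbors keep their vector.
    (Assumed aggregator, chosen for simplicity.)
    """
    user_nbrs, item_nbrs = defaultdict(list), defaultdict(list)
    for user, item in interactions:
        user_nbrs[user].append(item_vecs[item])
        item_nbrs[item].append(user_vecs[user])

    def update(vecs, nbrs):
        return {
            node: mean([vec, mean(nbrs[node])]) if node in nbrs else list(vec)
            for node, vec in vecs.items()
        }

    return update(user_vecs, user_nbrs), update(item_vecs, item_nbrs)


def fuse(per_modality):
    """Sum each node's modal-specific vectors (an assumed fusion rule)."""
    fused = {}
    for vecs in per_modality:
        for node, vec in vecs.items():
            prev = fused.get(node, [0.0] * len(vec))
            fused[node] = [a + b for a, b in zip(prev, vec)]
    return fused


# Toy data (made up): two users, two micro-videos, two modalities.
interactions = [("u1", "v1"), ("u1", "v2"), ("u2", "v2")]
modalities = {
    "visual": ({"u1": [0.0, 0.0], "u2": [2.0, 2.0]},
               {"v1": [1.0, 1.0], "v2": [3.0, 3.0]}),
    "acoustic": ({"u1": [1.0, 0.0], "u2": [0.0, 1.0]},
                 {"v1": [0.0, 2.0], "v2": [2.0, 0.0]}),
}

# Propagate within each modality's graph, then fuse across modalities.
results = [propagate(interactions, u, v) for u, v in modalities.values()]
user_repr = fuse([users for users, _ in results])
item_repr = fuse([items for _, items in results])
```

Keeping one graph per modality means a user's visual preference is aggregated only from visual features of the micro-videos they interacted with, which is the modal-specific aspect the abstract describes; how the modalities are actually combined in the thesis is not stated here.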
Keywords/Search Tags:Representation Learning, Complex relation modeling, Micro-video Understanding, Micro-video Venue Category, Personalized Recommendation of Micro-video