Research On Venue Semantic Modeling Algorithm Based On Multimodal Data

Posted on: 2019-12-20 | Degree: Master | Type: Thesis
Country: China | Candidate: W J Peng
GTID: 2428330623962497 | Subject: Information and Communication Engineering
Abstract/Summary:
Multimodal media data describes a scene or an object from multiple aspects, including text, image, video, audio and other media forms, each of which is a modality. With the rapid development of the Internet, information dissemination has entered the era of globalization. The diversification of media data has enabled various location-based social networks to flourish, allowing users to share content anytime and anywhere. As the saying goes, "A scholar does not step outside his gate, yet he knows the happenings under the sun". The explosive growth of data has made research on multimodal media data a current hotspot. Although data from different modalities complement each other, they also differ greatly, and the "semantic gap" problem makes multimodal data processing difficult. To address these problems, this thesis proposes a venue semantic modeling method based on multimodal media data, which introduces deep learning into multimedia data analysis and integrates data from multiple modalities for semantic modeling. First, the Scrapy framework is used to crawl multimedia data from the Foursquare and Flickr websites, and a content detector is trained with a CNN-LSTM network to resolve the inconsistency of data between different modalities. Then, the relationships between the data are used to construct a multimodal heterogeneous graph. Finally, the semantic information of the venue is extracted by graph classification. Based on the obtained venue semantics, this thesis proposes three applications: image scene prediction, venue semantic summarization, and automatic image annotation. To verify the effectiveness of the proposed algorithm, a large number of experiments were performed on cross-platform datasets. The experimental results show that the proposed semantic modeling algorithm can predict image scenes at the venue level, provide more diverse and complete venue semantic summaries, and automatically annotate images, which further illustrates the advantages of the algorithm.
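
As an illustration of the heterogeneous graph step only, the following is a minimal sketch under stated assumptions, not the method of the thesis. It assumes each crawled record links one venue, one image, and a list of tags; the record fields, node types, and function names (build_hetero_graph, venue_tag_counts) are hypothetical. A simple two-hop tag count stands in for the graph classification step, which the abstract does not specify.

```python
from collections import defaultdict


def build_hetero_graph(records):
    """Build an undirected typed graph from (venue, image, tags) records.

    Nodes are (node_type, node_id) tuples. Edges link venue-image and
    image-tag pairs. The record layout is an assumption for illustration.
    """
    adjacency = defaultdict(set)
    for rec in records:
        venue = ("venue", rec["venue"])
        image = ("image", rec["image"])
        adjacency[venue].add(image)
        adjacency[image].add(venue)
        for tag in rec["tags"]:
            tag_node = ("tag", tag)
            adjacency[image].add(tag_node)
            adjacency[tag_node].add(image)
    return adjacency


def venue_tag_counts(adjacency, venue_id):
    """Count tags reachable from a venue through its images (two hops).

    This frequency count is a stand-in for semantic extraction, not the
    graph classification used in the thesis.
    """
    counts = defaultdict(int)
    for image in adjacency.get(("venue", venue_id), ()):
        for neighbor in adjacency[image]:
            if neighbor[0] == "tag":
                counts[neighbor[1]] += 1
    return dict(counts)


# Hypothetical toy records; real data would come from the crawled datasets.
records = [
    {"venue": "cafe_1", "image": "img_a", "tags": ["coffee", "latte"]},
    {"venue": "cafe_1", "image": "img_b", "tags": ["coffee", "cake"]},
    {"venue": "park_1", "image": "img_c", "tags": ["tree", "lake"]},
]
graph = build_hetero_graph(records)
summary = venue_tag_counts(graph, "cafe_1")
```

Keeping the node type inside each node key lets venues, images, and tags share one adjacency structure while remaining distinguishable, which is the basic property a heterogeneous graph needs.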
Keywords/Search Tags: Multimodal, Graph model, Venue prediction, Semantic extraction, Image annotation