Research On Venue Semantic Modeling Algorithm Based On Multimodal Data

Posted on:2019-12-20

Degree:Master

Type:Thesis

Country:China

Candidate:W J Peng

Full Text:PDF

GTID:2428330623962497

Subject:Information and Communication Engineering

Abstract/Summary:

Multimodal media data describes a scene or an object from multiple aspects through text, images, video, audio, and other media forms, each of which is a modality. With the rapid development of the Internet, information dissemination has entered the era of globalization. The diversification of media data has enabled location-based social networks to flourish, allowing users to share content anytime and anywhere. As the saying goes, “A scholar does not step outside his gate, yet he knows the happenings under the sun.” The explosive growth of data has made multimodal media data a research hotspot. Although data from different modalities complement one another, they also differ greatly, and the "semantic gap" problem makes multimodal data processing difficult.

To address these problems, this thesis proposes a venue semantic modeling method based on multimodal media data. It introduces deep learning into multimedia data analysis and integrates data from multiple modalities for semantic modeling. First, the Scrapy framework is used to crawl multimedia data from Foursquare and Flickr, and a content detector is trained with a CNN-LSTM network to resolve the inconsistency of data between modalities. Next, the relationships among the data are used to construct a multimodal heterogeneous graph. Finally, the semantic information of each venue is extracted by graph classification. Based on the obtained venue semantics, the thesis presents three applications: image scene prediction, venue semantic summarization, and automatic image annotation. To verify the effectiveness of the proposed algorithm, extensive experiments were performed on cross-platform datasets. The results show that the proposed semantic modeling algorithm achieves venue-level image scene prediction, provides more diverse and complete venue semantic summaries, and enables automatic image annotation, which further demonstrates the advantages of the algorithm.
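As a rough illustration of the graph-construction step, the sketch below builds a small heterogeneous graph in which venues, images, and text tags are separate node types linked by typed edges. The abstract does not specify the node types, edge relations, or data layout, so `HeteroGraph`, `build_venue_graph`, the record fields, and the relation names `taken_at` and `tagged_with` are all assumptions made for illustration, not the thesis's actual design.

```python
from collections import defaultdict


class HeteroGraph:
    """Toy heterogeneous graph: every node has a type, every edge a relation."""

    def __init__(self):
        self.node_type = {}           # node id -> node type
        self.adj = defaultdict(set)   # node id -> {(neighbor id, relation)}

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, u, v, relation):
        # Both endpoints must already exist; edges are stored undirected.
        for n in (u, v):
            if n not in self.node_type:
                raise KeyError(f"unknown node: {n!r}")
        self.adj[u].add((v, relation))
        self.adj[v].add((u, relation))

    def neighbors(self, node_id, node_type=None):
        # Sorted neighbor ids, optionally restricted to one node type.
        return sorted({
            v for v, _ in self.adj[node_id]
            if node_type is None or self.node_type[v] == node_type
        })


def build_venue_graph(records):
    """Link each image to its venue and to its tags (hypothetical schema)."""
    g = HeteroGraph()
    for rec in records:
        venue = ("venue", rec["venue"])
        image = ("image", rec["image"])
        g.add_node(venue, "venue")
        g.add_node(image, "image")
        g.add_edge(image, venue, "taken_at")
        for tag in rec["tags"]:
            tag_node = ("tag", tag)
            g.add_node(tag_node, "tag")
            g.add_edge(image, tag_node, "tagged_with")
    return g


# Made-up records standing in for crawled Foursquare/Flickr items.
records = [
    {"venue": "cafe_1", "image": "img_001.jpg", "tags": ["coffee", "latte"]},
    {"venue": "cafe_1", "image": "img_002.jpg", "tags": ["coffee"]},
    {"venue": "park_7", "image": "img_003.jpg", "tags": ["tree"]},
]
graph = build_venue_graph(records)
```

Keying nodes by a `(type, name)` tuple keeps a tag and a venue with the same name from colliding. A graph classifier, as the abstract describes, would then operate on a structure of this kind, for example on the subgraph around each venue node.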

Keywords/Search Tags:

Multimodal, Graph model, Venue prediction, Semantic extraction, Image annotation

Related items

1	Multimodal Multimedia Data Analysis And Key Technology Research
2	Research On Semantic Knowledge Extraction For Domain-Specific Images
3	Research On Image Semantic Annotation Based On Sequential Prediction Learning
4	The Study Of Semantic Annotation Technique Of Insects Image
5	Semantic Image Retrieval Based On Automatic Image Annotation And Translation
6	3D Model-based Semantic Annotation Management Research And Application
7	Research On The Automatic Image Annotation And Annotation Refinement Algorithms
8	Semantic Annotation For Documents In Professional Domain Based On NLP
9	Research On 3D Model Semantic Auto-annotation Based On Ontology
10	Semantic-based Image Multiclass Annotation