
Research on Popularity Prediction of Tourist Attractions Based on Multi-source Heterogeneous User-Generated Data

Posted on: 2020-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Duan
Full Text: PDF
GTID: 2428330596975069
Subject: Computer Science and Technology
Abstract/Summary:
Point of Interest (POI) is a term from geographic information systems. It generally refers to geographical objects, especially entities closely tied to people's daily lives, such as restaurants and shopping malls. In this research, the points of interest are tourist attractions. POI popularity prediction aims to estimate how popular a scenic spot will be in the next period based on its features. This research is valuable because it not only improves the accuracy of attraction recommendation and route planning for visitors, but also provides reference information for mining the commercial value hidden in unpopular spots. By analyzing real data, we find that tourist numbers are unevenly distributed across the major scenic spots and that most scenic spots receive few visitors. However, these so-called unpopular scenic spots are rated highly on tourism websites, which suggests that the value of these “unpopular” scenic spots may be greatly underestimated.

Research on popularity prediction and its applications currently focuses on online content, such as reposts of user comments on Weibo and Twitter. Little work addresses popularity prediction for tourist sites. Related work on scenic-spot popularity mainly concentrates on attractions with sufficient information. It overlooks the significant influence on tourism trends of scenic spots, including emerging ones, that lack information but have potential. In addition, there is no effective solution to the unbalanced data distribution across scenic spots, which is caused by large real-world differences between spots and by data scarcity for some attractions.

There are three main challenges for POI popularity prediction on real-world data. 1) The descriptive information about POIs on social networks is very sparse. Even on well-known, widely used websites, a large proportion of POIs have only a few photos and/or little associated text. Data scarcity and data imbalance pose a great challenge to our study. 2) Different types of attractions may be very similar in visual appearance and/or text description, which may bias the classification and recognition of scenic spots. For instance, it is difficult to distinguish a picking garden from an urban park from images alone. Such similar points of interest exist in real-world data, and traditional methods struggle to resolve this visual (semantic) ambiguity. 3) Little research has been done on effectively fusing multi-modal features from multiple sources to model POI data.

To address these bottlenecks, we propose a heterogeneous multi-clue hierarchical model that integrates multi-view learning, deep learning, and other techniques. The model simultaneously injects semantic knowledge and multi-clue representational power into POIs, and it is divided from top to bottom into a “Topic layer”, a “POI layer”, a “feature layer”, and a “tag layer”. To implement POI modelling effectively, the first two layers fully exploit semantic information and complete a preliminary classification of POIs. The third layer devises a multi-clue representation for feature fusion and performs popularity prediction. The last layer yields the prediction result for each individual POI. Specifically, for multi-clue feature processing, we use traditional early fusion methods, late fusion methods, and several feature fusion methods based on deep multi-view learning.

To simulate a real environment for more accurate model evaluation, we construct a multi-source POI dataset by collecting data for Sichuan Province, China, from four mainstream tourism platforms covering 2006 to 2018. Extensive experimental results show that the proposed multi-clue hierarchical model significantly improves the performance of attraction popularity prediction.
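
The abstract names early fusion and late fusion without defining them. As a rough illustration only, and not the thesis's actual implementation, the sketch below shows the generic form of both: early fusion concatenates per-view feature vectors (for example, image and text features of a POI) before a single predictor sees them, while late fusion combines the scores of separate per-view predictors. The function names, the averaging rule, and the toy arrays are assumptions made for this example.

```python
import numpy as np


def early_fusion(views):
    """Early fusion: concatenate per-view feature matrices along the feature axis.

    Each element of `views` has shape (n_samples, n_features_of_that_view).
    """
    return np.concatenate(views, axis=1)


def late_fusion(scores, weights=None):
    """Late fusion: weighted average of per-view prediction scores.

    Each element of `scores` has shape (n_samples,). Uniform weights are
    an assumption of this sketch; they are normalised to sum to 1.
    """
    stacked = np.stack(scores, axis=0)  # shape (n_views, n_samples)
    if weights is None:
        weights = np.ones(stacked.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return np.tensordot(weights, stacked, axes=1)  # shape (n_samples,)


if __name__ == "__main__":
    # Hypothetical toy data: 3 POIs, a 4-dim image view and a 2-dim text view.
    image_feats = np.ones((3, 4))
    text_feats = np.zeros((3, 2))
    print(early_fusion([image_feats, text_feats]).shape)

    # Hypothetical per-view popularity scores for the same 3 POIs.
    image_scores = np.array([0.2, 0.8, 0.5])
    text_scores = np.array([0.4, 0.6, 0.9])
    print(late_fusion([image_scores, text_scores]))
```

Early fusion lets one model learn interactions between views but requires every view to be present for every POI. Late fusion tolerates weak or missing views more easily, which may matter when many POIs have only a few photos or little text.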
Keywords/Search Tags:Social media, tourist attractions, popularity prediction, multi-source heterogeneous data, hierarchical model