Analysis Of Guilin Tourists' Satisfaction Based On Data Mining

Posted on: 2018-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Li
GTID: 2348330518456349
Subject: Applied statistics
Abstract/Summary:
With the growth of the Internet, many tourists publish their travel experiences and reviews on online platforms, producing a massive amount of visitor data. Tourism sites and the relevant departments that want to raise operational efficiency and improve the tourism environment need to extract useful information from this data. This thesis applies data mining techniques to online reviews of Guilin tours posted on the Ctrip online travel site. The main work and conclusions are as follows.

First, using the Octopus collector, a tool based on web crawler technology, we studied the rules for capturing visitor review data from Ctrip, collected 1,260 reviews from the site, and exported them in Excel format. After invalid reviews were removed from the raw data, 1,210 reviews totaling nearly 100,000 words remained as the sample dataset.

Second, visualization techniques and the LDA topic model were used to analyze the characteristics of the review text. A word cloud identifies the high-frequency words intuitively, and combining these with a classification method determines the factors that affect tourist satisfaction; a semantic network diagram then shows the semantic relations among the high-frequency words. Finally, the LDA topic model extracts the topics of the text dataset, giving the eight topics tourists attend to most: itinerary, attractions, hotel, tour guide, dining, shopping, explanation, and service.

Third, we constructed a sentiment dictionary suited to the needs of this study and used it for sentiment analysis. Using Python to calculate the sentiment value of every review, we found that 33.64% of the visitors have high loyalty.

Fourth, we quantified the collected text data and then applied correlation analysis and regression analysis to the review data to build a model. We also compared review data across different years and months.

Finally, the results are summarized, and suggestions are offered to the tourism departments in Guilin and to related tourism websites for reference.
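The dictionary-based sentiment scoring mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the lexicon entries, their weights, the negation rule, and the threshold are all hypothetical, and reviews are assumed to be already split into tokens (the thesis's own dictionary and scoring rules are not given in the abstract).

```python
# Hypothetical sentiment lexicon: words and weights are illustrative only,
# not taken from the thesis.
POSITIVE = {"beautiful": 1.0, "friendly": 1.0, "recommend": 1.5}
NEGATIVE = {"crowded": -1.0, "expensive": -1.0, "rude": -1.5}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(tokens):
    """Sum lexicon weights over a tokenized review.

    A negation word flips the sign of the token that immediately
    follows it (an assumed, deliberately simple rule).
    """
    score = 0.0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        weight = POSITIVE.get(tok, 0.0) + NEGATIVE.get(tok, 0.0)
        if weight != 0.0:
            score += -weight if negate else weight
        negate = False  # negation applies to the next token only
    return score


def share_above(scores, threshold):
    """Fraction of reviews whose score exceeds a (hypothetical) threshold."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s > threshold) / len(scores)


if __name__ == "__main__":
    reviews = [
        ["beautiful", "scenery", "friendly", "guide"],
        ["not", "friendly"],
        ["crowded", "and", "expensive"],
    ]
    scores = [sentiment_score(r) for r in reviews]
    print(scores, share_above(scores, 1.0))
```

A share such as the one returned by `share_above` is the kind of quantity a figure like "33.64% of the visitors" would summarize, although the cutoff used in the thesis is not stated here.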
Keywords/Search Tags:data mining, tourist satisfaction, visualization, text sentiment analysis, LDA topic model