
Data Analysis Of Tourism Review In Yunnan Province Based On Deep Learning

Posted on: 2021-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wan
GTID: 2518306230480104
Subject: Master of Applied Statistics
Abstract/Summary:
With the development of deep learning, sentiment analysis of semi-structured and unstructured text has become increasingly valuable in management, public opinion analysis, and early warning systems. Deep learning dispenses with the manual feature engineering that traditional analysis methods require, and the structure of the models is fundamentally different. When compared on the same data, neural network models perform better than other statistical learning models. This thesis studies text sentiment analysis based on deep learning.

The first point is the optimization of the structure of the unbalanced data. The unbalanced data are sampled and split into subsets, and the weighted sum of multiple sub-models is used to build an ensemble model, which reduces overfitting and improves validation-set accuracy. The second point is a comparison of different levels of text coverage for the input travel review text. We find that, provided the length distribution of the reviews is known in advance, the text-length threshold can be chosen as the length that fully covers 99% of the reviews. This significantly improves modeling efficiency while largely preserving classification accuracy. The third point is a comparison of three recurrent network models, RNN, LSTM, and GRU, with the length of the Chinese text string fixed at 73. The classification accuracies of the LSTM and GRU models are similar, and both are good. Taking the complexity of the algorithm into account, the GRU model has the better overall performance and is well suited to processing long text with a recurrent neural network.

For hotel reviews, another sub-field of tourism data, transfer learning can be applied to the sentiment classification problem. The embedding layer learned from the attraction review data is substituted directly into the new model and frozen, while the other layers are initialized from a uniform distribution or to zero. After a short period of training, the model also achieves good classification accuracy. This approach avoids the need for large-scale data in the target domain.

Finally, we mine information from tourism review data for Yunnan Province. We obtain descriptive statistics for the best- and worst-rated tourist attractions on the Internet, and we use word clouds and the TF-IDF method to extract key information from the large volume of comments. This information can serve as a basis for the construction, management, and rectification of specific scenic spots.
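The length-threshold selection described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis code: the function names `length_threshold` and `pad_or_truncate`, the nearest-rank percentile rule, the padding id 0, and the toy data are all assumptions made for illustration. The idea is to pick the smallest length that covers a given share of the reviews (99% here) and then pad or truncate every tokenized review to that length.

```python
def length_threshold(lengths, coverage_pct=99):
    """Return the smallest length L such that at least coverage_pct percent
    of the reviews have length <= L (nearest-rank percentile)."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths)
    n = len(ordered)
    # Integer ceiling of coverage_pct * n / 100, to avoid float rounding.
    rank = (coverage_pct * n + 99) // 100
    return ordered[max(rank, 1) - 1]


def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Cut a token-id list to max_len, or right-pad it with pad_id."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


# Hypothetical toy data: 99 short reviews and one very long outlier.
review_lengths = [10] * 99 + [500]
max_len = length_threshold(review_lengths, coverage_pct=99)

# Every review is brought to the same length before it reaches the model.
short_review = pad_or_truncate([5, 8, 13], max_len)
long_review = pad_or_truncate(list(range(1, 501)), max_len)
```

In this toy case the single 500-token outlier no longer dictates the input length, which is the efficiency gain the abstract refers to; only the longest 1% of reviews are truncated.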
Keywords/Search Tags:Sentiment analysis, Deep learning, Model optimization, Transfer learning, Data mining