| In an era of rapid Internet development, more and more users post remarks and opinions on e-commerce platforms, producing a large amount of user-generated content; the same is true of travel service platforms. The reviews that users write on travel service platforms contain their real views and emotions. If this valuable information can be analyzed, it will be of great significance to consumers, businesses, platforms, and the industry. To this end, this thesis studies sentiment classification of travel review texts. Sentiment analysis of review text extracts useful information from the deep semantics of the text. Existing research mainly approaches sentiment analysis of review text at two levels: coarse-grained and fine-grained. This thesis takes travel review texts from real-world scenarios as its research object, studies both coarse-grained and fine-grained sentiment classification of these texts, and designs and implements a prototype sentiment classification system for travel review texts based on the research results, to help consumers make decisions. The specific research contents are as follows: (1) This thesis introduces the background and significance of the subject, analyzes the state of research at home and abroad on sentiment analysis technology and on sentiment analysis of travel review texts, and presents the relevant theoretical and technical foundations. (2) The thesis uses Python crawlers to collect a large volume of review texts for tourist attractions from the Ctrip mobile website, cleans and labels the data according to the research content, and constructs data sets for coarse-grained and fine-grained sentiment classification of travel review texts in real-world scenarios. (3) In view of the fact that current coarse-grained sentiment classification models for travel review texts usually use
static word vectors such as Word2Vec to vectorize the text, which cannot handle polysemy, and that downstream models relying on a recurrent neural network or a convolutional neural network alone suffer from insufficient feature mining and low accuracy, this thesis proposes a two-channel coarse-grained sentiment classification model based on BERT, namely DCM-BERT. First, the BERT model is used to produce word embeddings for the text. The word vectors are then fed into a bidirectional GRU and a TextCNN network to extract the sequence features and local features of the text, respectively, and the two sets of features are combined to generate the final representation of the review text. Finally, this representation is used for sentiment classification. The experimental results show that the proposed model achieves higher classification accuracy. (4) To address the problem that existing fine-grained sentiment classification models do not make full use of text position information and dependency syntax, this thesis proposes a fine-grained sentiment classification model based on BERT, namely BERT-FGSCM. First, the BERT model is used to produce word embeddings that capture the local and global semantic representation of the text; then, the word vectors are weighted according to the position information of the text, and the dependency syntactic relations of the text are mined through a graph convolutional network to enhance the expressive power of the model; finally, a multi-head attention mechanism is used to filter the features and obtain the final representation of the text for classification. For the position weighting of word vectors in BERT-FGSCM, existing models do not treat the text other than the aspect word as a whole, which results in semantic fragmentation; to address this, the position weighting method is improved and ablation experiments are carried out. The results of the ablation experiments show that the
position weighting method proposed in this thesis makes better use of the position information of the text and improves classification accuracy. In addition, ablation experiments are carried out on the model before and after the introduction of dependency syntax. The results show that introducing dependency syntactic relations improves the classification accuracy of the model by 2.19%. The overall experimental results show that the BERT-FGSCM model proposed in this thesis effectively improves classification accuracy. (5) Based on the coarse-grained sentiment classification model proposed in this thesis, a prototype sentiment classification system is designed and implemented. |
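
To make the idea of position weighting in item (4) concrete, the following is a minimal sketch of a linear-decay scheme that is commonly used in aspect-level sentiment models. It is not the improved weighting method of the thesis, whose formula the abstract does not give. The function names `position_weights` and `apply_weights`, the decay formula `1 - distance / n_tokens`, and the choice to give aspect tokens a weight of 1.0 are all assumptions made for illustration.

```python
from typing import List


def position_weights(n_tokens: int, aspect_start: int, aspect_len: int) -> List[float]:
    """Return one weight per token, decaying linearly with distance to the aspect span.

    Assumed scheme (not taken from the thesis): weight = 1 - distance / n_tokens,
    where distance is the number of tokens between a context token and the
    nearest edge of the aspect span. Aspect tokens themselves get weight 1.0.
    """
    if n_tokens <= 0 or aspect_len <= 0:
        raise ValueError("n_tokens and aspect_len must be positive")
    aspect_end = aspect_start + aspect_len  # exclusive end index of the aspect span
    if aspect_start < 0 or aspect_end > n_tokens:
        raise ValueError("aspect span must lie inside the sentence")

    weights = []
    for i in range(n_tokens):
        if i < aspect_start:
            distance = aspect_start - i      # token is left of the aspect
        elif i >= aspect_end:
            distance = i - aspect_end + 1    # token is right of the aspect
        else:
            distance = 0                     # token is part of the aspect
        weights.append(1.0 - distance / n_tokens)
    return weights


def apply_weights(vectors: List[List[float]], weights: List[float]) -> List[List[float]]:
    """Scale each token vector (e.g. a BERT output vector) by its position weight."""
    if len(vectors) != len(weights):
        raise ValueError("need exactly one weight per token vector")
    return [[w * x for x in vec] for vec, w in zip(vectors, weights)]
```

For an eight-token review whose aspect word is the sixth token, `position_weights(8, 5, 1)` gives the aspect token the largest weight and scales down tokens that are further away, so that words near the aspect contribute more to the representation passed on to later layers.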