Review Texts Analysis Based On Machine Learning

Posted on: 2020-11-13; Degree: Master; Type: Thesis
Country: China; Candidate: Z Y Ding; Full Text: PDF
GTID: 2428330572996913; Subject: Applied Statistics
Abstract/Summary:
As material living standards have risen, sightseeing has become a popular way for the public to pursue spiritual enrichment. Before traveling, tourists can book hotels on online platforms, where they read reviews left by past guests for opinions and reference. Hotels also pay close attention to those reviews, which show them how to improve their service. Review data is large and unstructured, so a hotel's situation cannot be understood accurately and comprehensively through manual reading alone. Accurately categorizing review texts and mining the information they contain with machine learning is therefore of considerable practical value. Based on text classification theory and the LDA topic model, this paper conducts an empirical study of the review texts of a hotel brand. First, the data is collected with a web crawler and preprocessed through data cleaning, Chinese word segmentation and text vectorization, and descriptive statistics are reported for the preprocessed data. Then four text classifiers are constructed: a logistic regression model, a support vector machine model, a random forest model and an artificial neural network model. Their performance is evaluated with indicators such as recall, accuracy and AUC. The support vector machine model performs best and is suitable for wide use on online platforms. It can accurately categorize the review texts of that hotel brand, and can make up for the shortcomings of platforms that either do not classify reviews or classify them poorly. Finally, LDA topic models are constructed separately for the positive and the negative review categories, to mine latent topics and extract the feature words of each topic for comparative analysis. Both positive and negative reviews focus mainly on geographical location, facilities, room size and sound insulation. Based on the information mined from the review texts, the paper offers targeted suggestions to tourists and to the hotel brand.
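The three evaluation indicators named in the abstract (recall, accuracy and AUC) can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the 0.5 decision threshold and the toy labels and scores below are all hypothetical, and only the Python standard library is used. AUC is computed here by its pairwise definition, the probability that a randomly chosen positive review receives a higher score than a randomly chosen negative one.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of reviews whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Fraction of truly positive reviews that were predicted positive."""
    true_pos = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    actual_pos = sum(1 for t in y_true if t == positive)
    return true_pos / actual_pos


def auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]  # assumed classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"recall   = {recall(y_true, y_pred):.3f}")
print(f"AUC      = {auc(y_true, scores):.3f}")
```

Accuracy and recall depend on the chosen threshold, whereas AUC depends only on how the scores rank the reviews, which is one reason to report all three when comparing classifiers.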
Keywords/Search Tags: review texts, machine learning, text classification, LDA topic model