
Research On Prediction Of Top-quality Tourism Service Formation Based On Ensemble Learning

Posted on: 2020-01-03; Degree: Master; Type: Thesis
Country: China; Candidate: Z Q Hu
GTID: 2428330596981797; Subject: Computer technology
Abstract/Summary:
As residents' incomes continue to rise, service-oriented consumption is becoming increasingly important in China, and tourism has become an indispensable part of daily life. With growing demand for diversified, non-standardized travel services, consumer behavior is becoming harder to predict. Reasonable prediction of consumer demand, together with analysis and mining of consumer preferences, spending power and purchase behavior patterns, will play a crucial role in the transformation from OTA (online travel agency) to ITA. This thesis takes the data of more than 40,000 OTA users, including personal information, behaviors, orders and comments, as the object of data mining, and builds an ensemble learning model that predicts whether a user will order a high-quality tourism service in the short term. Through data analysis and feature engineering, it explores how to construct important features from the original data set; through modeling and comparative analysis, it builds an accurate and efficient prediction model for high-quality tourism services.

Firstly, based on a review of research in China and abroad, the principles and key technologies of the LightGBM, CatBoost, RF and ET algorithms are introduced. Secondly, the original data is analyzed in detail from three aspects: data source, data structure and statistical analysis. The statistical analysis covers data volume, missing fields, user region, operation type, historical order type, ratings and the prediction target. Then, feature engineering is applied to the original data in four stages: data preprocessing, feature construction, feature extraction and feature selection. In the feature construction stage, 47 representation features, 860 behavior features, 141 state features and 21 benefit features are extracted. The comprehensive importance of each feature is calculated with ensemble learning algorithms, and features are selected according to that importance.

Next, using the split training and test sets, the prediction performance and efficiency of five single models built with the XGBoost, LightGBM, CatBoost, RF and ET algorithms are compared. By introducing diversity in features, algorithms and algorithm parameters, hybrid models based on the learning method and on the weighting method are constructed and analyzed; in the weighting method, three base models (XGBoost, LightGBM and CatBoost) are fused. Finally, the single models and hybrid models are compared in terms of AUC and total training time.

The results show that the hybrid model outperforms the single models in prediction performance regardless of the combination strategy. Among the combination strategies, the learning method achieves better prediction performance than the weighting method, but it is less efficient. Among the single models, CatBoost is slightly less efficient than the other algorithms but gives the best prediction results. The ensemble hybrid model based on the weighting method improves on the prediction results of the single models while keeping prediction efficiency within a reasonable range.
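
The weighting method and the AUC comparison described above can be illustrated with a minimal sketch. The abstract does not state the weights used or how they were chosen, so the weights, the score lists and the function names `auc` and `weighted_fusion` below are all hypothetical; the scores are synthetic stand-ins for the predicted probabilities of the three base models, not results from the thesis.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC (Mann-Whitney U); tied scores share an average rank."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with pairs[i].
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def weighted_fusion(model_scores: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Fuse per-model probability scores by a normalized weighted average."""
    if len(model_scores) != len(weights):
        raise ValueError("one weight is required per model")
    n = len(model_scores[0])
    if any(len(s) != n for s in model_scores):
        raise ValueError("all models must score the same samples")
    total = sum(weights)
    return [
        sum(w * s[i] for w, s in zip(weights, model_scores)) / total
        for i in range(n)
    ]


# Synthetic example: 1 = user ordered a high-quality service (assumed coding).
labels = [1, 0, 1, 0, 1, 0, 0, 1]
xgb_scores = [0.70, 0.40, 0.55, 0.60, 0.80, 0.20, 0.35, 0.45]  # hypothetical
lgb_scores = [0.65, 0.30, 0.35, 0.50, 0.75, 0.25, 0.55, 0.60]  # hypothetical
cat_scores = [0.80, 0.35, 0.60, 0.30, 0.70, 0.15, 0.40, 0.50]  # hypothetical

# Hypothetical weights; the thesis may have chosen them differently.
fused = weighted_fusion([xgb_scores, lgb_scores, cat_scores], [0.3, 0.3, 0.4])

for name, s in [("XGBoost", xgb_scores), ("LightGBM", lgb_scores),
                ("CatBoost", cat_scores), ("Weighted fusion", fused)]:
    print(f"{name}: AUC = {auc(labels, s):.3f}")
```

Averaging scores is cheap because it needs no further training once the base models are fitted, which is consistent with the abstract's remark that the weighting method keeps prediction efficiency within a reasonable range. The learning method, by contrast, fits an additional model on top of the base models' outputs.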
Keywords: Ensemble Learning, LightGBM, CatBoost, Feature Engineering, Model Fusion