
Research On Customer Churn Prediction Based On Algorithm Fusion

Posted on: 2019-09-24 | Degree: Master | Type: Thesis
Country: China | Candidate: T T Zhao
GTID: 2439330572461438 | Subject: Statistics
Abstract/Summary:
In recent years, with rapid economic development, commercial competition has become increasingly intense. Against a background of more competitors, business diversification, and economic globalization, more and more enterprises realize that to win customers is to win the market and thereby occupy a dominant position in the competition. Customers are the key to the long-term development of an enterprise, so effectively predicting customer churn and retaining customers is of great significance for an enterprise's survival and development. Customer churn has been studied in depth in communications, finance, and other industries, and churn prediction models have brought profit growth to the enterprises concerned, but customer churn in airlines has received little attention. Quality of service, competition within the industry, and other modes of transportation can all cause customer churn. How to identify customer loss, and then improve customer satisfaction and loyalty, is therefore key for airlines to reduce losses, retain market share, and face fierce competition. It is especially important to establish an appropriate and effective customer churn prediction model that helps airlines target customers with a high tendency to churn and minimize the loss.

The main goal of this paper is to build a fusion model to predict customer churn. The fusion model is based on five commonly used classification algorithms and integrates their advantages to improve prediction accuracy, which helps enterprises judge customer loss accurately and save costs. The paper uses the customer churn data of an airline. Because the data are imbalanced, after basic cleaning of missing and abnormal values, a combination of oversampling and undersampling is used to balance the sample distribution so that the classification models perform better. Because the data contain many variables and the application premises of the individual algorithms differ, Information Value and the correlation between variables are used to filter the input variables so that every algorithm receives the same inputs. Discriminant Analysis (DA), Logistic Regression (LR), Decision Tree (DT), Artificial Neural Network (ANN), and Support Vector Machine (SVM) models are then established. The results of these five models are combined: a constructed function incorporating the Type I error is introduced, and the optimal weights of the fusion model are obtained with the Lagrange Multiplier method. Finally, the fusion model is built from the results of the above models and used to predict the customer churn data. The models are evaluated by prediction accuracy, minimum misclassification error, and AUC, and the prediction results of the methods on the customer churn data are analyzed and compared.

Comparing the DA, LR, DT, ANN, and SVM models shows that the DA model does not have the best accuracy but has the smallest Type I error. The SVM model has the second smallest Type I error, only about 1% larger than that of DA, and the highest accuracy and AUC. The LR, DT, and ANN models perform well on Type II error, more than 3% lower than SVM, and their accuracy is only 2% to 3% lower than that of SVM, but their Type I error is about twice that of SVM. Because Type I error is relatively important in churn prediction in addition to accuracy, the SVM model is the best single model overall. The weighted fusion model uses the prediction scores of the five models and combines the advantages of each algorithm. Compared with the single algorithms, neither its Type I nor its Type II error is the minimum, but both are small and similar in size, and the fusion model has the highest accuracy and AUC. It therefore achieves higher classification and prediction accuracy with a lower misclassification rate, and is an accurate and effective method.

The contribution of this paper is to introduce the Type I error into the constructed function and to use the Lagrange Multiplier method to obtain the optimal weights of the fusion model, which is then built from the optimal weights and the prediction scores of the five models. The results show that the fusion model outperforms the single algorithms and is a feasible method. The customer churn prediction model can still be improved in several respects. The first is the modeling method. The fusion model in this paper is based on the five most commonly used classification algorithms. Although its accuracy and misclassification rate improve on the single algorithms, building the fusion model on improved classification algorithms should give more accurate results. Some algorithms could be replaced by more accurate ones such as Random Forest or Gradient Boosted Decision Tree, but the algorithms chosen should still differ from one another. The second is the data. With the development of big data, customer churn data need not be limited to an airline's internal data and can also include Internet data and behavior data from third-party booking apps, which helps airlines judge customer behavior from several angles and make more reasonable and comprehensive predictions. This is the future research direction for customer churn prediction models.
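
The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the constructed function, so everything below is an assumption made for illustration: the objective is a weighted squared error of the fused score, the extra weight `lam` on selected samples stands in for the Type I error term, and which samples count toward that term is passed in as `penalty_mask` because the abstract does not define it. The function name `fusion_weights` and all parameters are hypothetical. Under the constraint that the weights sum to 1, the Lagrange condition gives the closed form used in the code.

```python
import numpy as np

def fusion_weights(scores, y, penalty_mask, lam=1.0, ridge=1e-8):
    """Weights for a linear fusion of model scores (illustrative sketch only).

    scores:       (n_samples, n_models) predicted churn scores in [0, 1]
    y:            (n_samples,) true labels, 0 or 1
    penalty_mask: (n_samples,) bool, samples whose errors get extra weight
                  (assumed stand-in for the Type I error term)
    lam:          assumed extra penalty applied to the masked samples
    """
    errors = scores - y[:, None]                     # per-model residuals
    sample_w = 1.0 + lam * penalty_mask.astype(float)
    # Weighted error matrix E, so the objective is w^T E w.
    E = (errors * sample_w[:, None]).T @ errors / len(y)
    E += ridge * np.eye(E.shape[0])                  # keep E invertible
    ones = np.ones(E.shape[0])
    # L(w, mu) = w^T E w - mu (1^T w - 1); setting dL/dw = 0 gives
    # 2 E w = mu 1, so w is proportional to E^{-1} 1, scaled to sum to 1.
    v = np.linalg.solve(E, ones)
    return v / v.sum()

# Synthetic data for demonstration only; not the airline data of the thesis.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
noise = rng.normal(0.0, [0.2, 0.3, 0.4, 0.5, 0.6], size=(200, 5))
scores = np.clip(y[:, None] + noise, 0.0, 1.0)
w = fusion_weights(scores, y, penalty_mask=(y == 1), lam=2.0)
fused_score = scores @ w                             # fused churn score
```

With only the sum-to-one constraint, individual weights can be negative; a non-negativity constraint would need a numerical solver rather than this closed form.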
Keywords/Search Tags: Customer Churn Prediction, Binary Classification Algorithm, Fusion Model, Lagrange Multiplier