Customer Churn Prediction Based On Xgboost And Logistics Hybrid Algorithm

Posted on: 2021-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: X L Fan
Full Text: PDF
GTID: 2428330626461128
Subject: Applied statistics
Abstract/Summary:
With the advent of the 5G era and the transformation of the telecommunications market, competition in the telecommunications industry is becoming increasingly fierce. Telecommunications companies have therefore launched customer-centric service strategies to retain existing customers and win new ones, with the goal of maximizing profit. This thesis first summarizes the background, significance, and status of research at home and abroad. It then introduces the main ideas of customer relationship management and how customer relationship management creates value. From the perspective of data and algorithms, it reviews the main classification algorithms currently used in customer churn prediction, including random forests, support vector machines, and artificial neural networks.

In Chapter 4, a visual exploratory analysis is performed on the customer churn data used in this thesis. Customers are divided into churn and non-churn categories, and the differences between the two categories are explored variable by variable. For attribute variables, we study the distribution histograms of each variable for the different customer types and combine them with the sample proportion of each attribute value within each customer type; the impact of each variable on customer churn is then analyzed in light of its practical meaning. For numerical variables, probability density curves are fitted to the distribution histograms for each customer type and compared. Through this exploratory analysis, we screen out the features that have no significant impact on customer churn, and then combine variable importance indicators and variable correlation coefficients to determine the 15 features that finally enter the model.

Customer churn is then predicted following a modeling pipeline of feature extraction, data encoding, and classification. To improve the prediction accuracy and operating efficiency of the model, a hybrid model of XGBoost and logistic regression is proposed, combining XGBoost's parallel computation and handling of feature interactions with the good performance of logistic regression in linear prediction. To study the prediction performance of the fusion algorithm, it is compared with the random forest, XGBoost, and logistic regression algorithms. A comparison of the ROC curves, accuracy, precision, recall, and AUC values shows that the model built with the fusion algorithm based on XGBoost and logistic regression scores highest on all indicators, significantly better than the single classification algorithms and general ensemble learning, with an AUC of 0.94. This shows that the new model is effective for customer churn management.
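The evaluation indicators named in the abstract (accuracy, precision, recall, and AUC) can be illustrated with a minimal sketch. It assumes binary labels where 1 means churn and 0 means non-churn, a hypothetical decision threshold of 0.5, and a rank-based (Mann-Whitney) computation of AUC; the function names and the sample data are made up for illustration and are not taken from the thesis.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as churn and 0 as non-churn."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the fraction of (churner, non-churner) pairs in which
    the churner receives the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Compute the four indicators from predicted churn probabilities."""
    # Scores at or above the (assumed) threshold are predicted as churn.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "auc": auc(y_true, scores),
    }


# Made-up labels and scores, for illustration only (not the thesis data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.65, 0.4, 0.55, 0.1, 0.8, 0.3]
print(evaluate(y_true, scores))
```

On this toy data, accuracy, precision, and recall are each 0.75 and the AUC is 0.9375. Accuracy, precision, and recall depend on the chosen threshold, whereas AUC depends only on how the scores rank churners against non-churners, which is why it is often reported alongside the ROC curve when comparing classifiers.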
Keywords/Search Tags: Customer churn, Feature selection, Classification, Fusion model