
The Application Of Ensemble Learning In The Early Warning Model Of Operator User Churn

Posted on: 2022-02-13  Degree: Master  Type: Thesis
Country: China  Candidate: Y T Tan
GTID: 2518306722481934  Subject: Applied Statistics
Abstract/Summary:
Domestic telecom operators now face a difficult situation. On the one hand, the telecom market is becoming highly saturated, and insufficient innovation has led to serious product homogeneity, making it increasingly difficult to acquire new users. On the other hand, the growth of the Internet industry has expanded the channels through which people obtain information and communicate, which inevitably affects the traditional telecom industry. Telecom operators have gradually realized the importance of user retention. Using data-mining techniques to build a churn early-warning model that predicts customers' churn tendency and informs retention measures has become an important way for operators to reduce their churn rate.
This thesis discusses how to apply ensemble learning to churn prediction for telecom customers. Based on data released for a data-mining competition, the main work is to predict the churn tendency of telecom customers in advance using ensemble learning. First, the thesis explains why a churn prediction model is needed by analyzing the current situation of domestic telecom operators, derives the value and prospects of applying ensemble learning to telecom churn prediction from an analysis of its advantages, and reviews the current state of related research. Second, it introduces the theory of ensemble learning and the related machine learning algorithms in detail, which provides the theoretical support for the research. Next, a series of preprocessing steps is applied to the collected samples to complete the data preparation stage. In the model-building stage, the thesis proposes classification models for telecom churn prediction based on three ensemble learning approaches: Bagging, Boosting, and Stacking. Random forest is selected as the representative Bagging algorithm, while GBDT and XGBoost are used as the representative Boosting algorithms for model training and parameter tuning. These three learners then serve as the primary learners in Stacking, with logistic regression as the meta-learner, to build a stacking-based telecom churn prediction model. In addition, the primary learners are assigned weights based on their misclassification rates on churned and non-churned samples to build weighted-stacking churn prediction models. All of these models are compared and evaluated on the selected evaluation metrics. Finally, the thesis proposes several supplements and improvements concerning the model's real-world application scenarios.
According to the results, XGBoost outperforms the other two learners on AUC and F1 on the test set and has the best classification performance among the single learners. The stacked model outperforms all single learners on AUC, F1, and Precision, which shows that stacking improves classification performance. Furthermore, a weighted-stacking model for churn samples, with weights derived from each single classifier's misclassification rate on positive samples, performs better than unweighted Stacking on Recall and F1. These results can offer telecom operators some guidance on user retention. A better model can identify potential churners more comprehensively and precisely, helping operators gain greater benefit at lower cost when reducing the churn rate.
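The abstract does not give the exact weighting formula, so the following is only a minimal sketch of the weighted-stacking idea under stated assumptions. Each primary learner is assumed to receive a weight proportional to one minus its miss rate on churned (positive) validation samples, and its predicted churn probability is scaled by that weight before being passed to the meta-learner. The function names, the weighting rule, and the toy labels are all hypothetical and are not taken from the thesis. Precision, Recall, and F1 are computed from their standard definitions.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 as 'churned'."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Standard Precision, Recall, and F1 for the churned class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def churn_miss_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of churned samples that the learner labels as non-churned."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    positives = tp + fn
    return fn / positives if positives else 0.0


def learner_weights(y_true: Sequence[int], preds: Dict[str, Sequence[int]]) -> Dict[str, float]:
    """ASSUMED scheme: weight proportional to (1 - miss rate), normalized to sum to 1."""
    scores = {name: 1.0 - churn_miss_rate(y_true, p) for name, p in preds.items()}
    total = sum(scores.values())
    if total == 0:
        # Every learner missed every churner: fall back to uniform weights.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: score / total for name, score in scores.items()}


def weighted_meta_features(probas: Dict[str, Sequence[float]],
                           weights: Dict[str, float]) -> List[List[float]]:
    """One row per sample, one column per learner (sorted by name),
    each churn probability scaled by that learner's weight."""
    names = sorted(probas)
    n_samples = len(probas[names[0]])
    return [[weights[name] * probas[name][i] for name in names] for i in range(n_samples)]


# Toy validation labels and hard predictions (illustrative only, not thesis data).
y_valid = [1, 1, 1, 1, 0, 0, 0, 0]
valid_preds = {
    "random_forest": [1, 1, 0, 0, 0, 0, 0, 1],  # misses 2 of 4 churners
    "gbdt":          [1, 1, 1, 0, 0, 0, 1, 0],  # misses 1 of 4 churners
    "xgboost":       [1, 1, 1, 1, 0, 1, 0, 0],  # misses no churners
}
weights = learner_weights(y_valid, valid_preds)

# Toy churn probabilities for two new samples, scaled into meta-learner inputs.
toy_probas = {
    "random_forest": [0.9, 0.2],
    "gbdt":          [0.8, 0.1],
    "xgboost":       [0.6, 0.3],
}
meta_X = weighted_meta_features(toy_probas, weights)

for name in sorted(weights):
    print(name, round(weights[name], 4), precision_recall_f1(y_valid, valid_preds[name]))
```

In a full pipeline the rows of `meta_X` would be the training inputs of the logistic regression meta-learner. For the unweighted variant, scikit-learn's `StackingClassifier` with a `LogisticRegression` final estimator is one existing option, though the abstract does not say which implementation was used.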
Keywords/Search Tags: Ensemble Learning, Random Forest, GBDT, XGBoost, Stacking, Churn prediction