Research on Bagging of XGBoost Classifiers for Churn Prediction in Telecom

Posted on: 2020-09-09
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Xu
Full Text: PDF
GTID: 2428330590960700
Subject: Software engineering
Abstract/Summary:
In recent years, with the development of the mobile communication industry, operators have paid increasing attention to customer management. According to relevant research, the cost of acquiring a new customer is six times that of retaining an existing one. Therefore, it is important for operators to predict customer churn. Operators have now accumulated a large amount of user and behavioral information, and this large volume of telecom data makes churn prediction more feasible. However, several problems remain: the large volume of data and the data consistency and integrity issues caused by heterogeneous data sources; the high dimensionality of telecom user data; and the imbalance between the numbers of churning and non-churning customers. These problems make customer churn prediction difficult. To address them, this research proposes the following solutions: integrating telecom operation data from heterogeneous data sources on a distributed data platform built on Apache Hadoop and Spark; mining the hidden feature information in large-scale telecom user data using graph theory, natural language processing, stacked autoencoders, and other methods; and comparing several sampling methods for handling data imbalance. Data set sampling methods and a method that mixes different sampling modes are presented. An ensemble model based on XGBoost and Bagging, combined with the mixed sampling method, is proposed; it makes full use of the imbalanced data set and builds a binary classification model to predict the churn of telecom users. Finally, experiments verify the effectiveness of the model using commonly used evaluation metrics such as accuracy, recall, and AUC, together with the expected profit of customer retention campaigns.
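The abstract does not spell out how the bags are built or how the base models are combined, so the following is only a minimal sketch under stated assumptions: each bag mixes random undersampling of non-churners with random oversampling (with replacement) of churners, and the ensemble averages the predicted churn probabilities. The names `mixed_sample`, `fit_bagging`, `predict_proba`, and `fit_centroid` are hypothetical. To stay dependency-free, the base learner here is a simple centroid scorer, not XGBoost; in the thesis setting the `fit` argument would wrap an XGBoost classifier instead.

```python
import random


def mixed_sample(X, y, rng):
    """Build one balanced bag (assumed scheme, not taken from the thesis):
    oversample churners (label 1) and undersample non-churners (label 0)."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    # Per-class target size: geometric mean of the two class counts.
    target = max(1, round((len(pos) * len(neg)) ** 0.5))
    pos_idx = [rng.choice(pos) for _ in range(target)]   # with replacement
    neg_idx = rng.sample(neg, min(len(neg), target))     # without replacement
    idx = pos_idx + neg_idx
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_bagging(X, y, fit, n_bags=10, seed=0):
    """Train one base model per resampled bag.
    `fit(rows, labels)` must return a function mapping a row to P(churn)."""
    rng = random.Random(seed)
    return [fit(*mixed_sample(X, y, rng)) for _ in range(n_bags)]


def predict_proba(models, row):
    """Aggregate the bag models by averaging their churn probabilities."""
    return sum(model(row) for model in models) / len(models)


def fit_centroid(X, y):
    """Stand-in base learner (NOT XGBoost): scores a row by its relative
    distance to the churn and non-churn class centroids."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    c1 = mean([r for r, label in zip(X, y) if label == 1])
    c0 = mean([r for r, label in zip(X, y) if label == 0])

    def predict(row):
        d1, d0 = dist(row, c1), dist(row, c0)
        # Closer to the churn centroid gives a score closer to 1.
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5

    return predict
```

Because every bag is drawn independently, all of the scarce churn records can appear across the ensemble while each base model still trains on a balanced set, which is one plausible reading of "makes full use of the imbalanced data set".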
Keywords/Search Tags: telecom industry, churn prediction, autoencoder, XGBoost, imbalanced data