Research On Risk Early Warning Model Of Customers Loss In Telecommunication Based On Combined Forecasting Algorithm

Posted on: 2021-04-28   Degree: Master   Type: Thesis
Country: China   Candidate: K Sun   Full Text: PDF
GTID: 2428330623465497   Subject: Statistics
Abstract/Summary:
The strategic goal of telecommunications companies has shifted from acquiring new customers to retaining existing ones. In an increasingly mature telecommunications market, acquiring new customers is becoming harder and requires substantial human and financial resources. For a long time, operators put most of their effort into opening new markets and developing new customers and paid too little attention to existing customers. Opening new markets inevitably incurs high operating costs, and it has also produced a degree of false growth in customer numbers. Studies have shown that acquiring a new customer costs more than retaining an existing one and brings in less revenue. A satisfied existing customer tells about 2-3 people, whereas a dissatisfied one tells about 8-10, which strongly affects the company's reputation.

Using behavioral data on an operator's broadband customers in one city, this thesis builds an early warning model that predicts whether a customer will continue to pay. The time window is five months: the first three months are the analysis window used for model building, the fourth month is the retention window in which the enterprise can take retention measures, and the fifth month is the forecast window, in which it is observed whether the customer has churned. Data through the third month therefore predict churn in the fifth month, giving a forecast with a one-month gap. The main goal of the thesis is to build a combined model for this every-other-month churn prediction. The combined model is a classification algorithm based on a linear combination of four sub-models. It exploits the strengths of each sub-model, improves classification ability, and helps the company retain customers, which matters for corporate revenue.

The selected data are severely imbalanced and have few attributes. After missing values and outliers are handled, feature engineering is carried out, including feature derivation and feature selection. Effective feature derivation can improve model accuracy, and feature selection can speed up the model with little or no effect on its performance. To address the severe class skew, the data are resampled in two directions: the majority class is randomly undersampled, and the minority class is then oversampled with SMOTE, with a ratio of 10:1. This process is repeated four times, and each of the four resulting data sets is used to train one base classifier. The base classifiers are logistic regression, support vector machine, neural network, and XGBoost, which differ considerably from one another. In this way the imbalance problem is addressed and model performance is improved.

The four trained base classifiers predict the test set, and their predictions are linearly combined to give the prediction of the combined model. The key to the combined model is solving for the coefficients. To construct a better loss function for the combined model, the thesis introduces the type I classification error rate as a penalty term on the coefficients, uses the Lagrange multiplier method to add the constraints on the coefficients to the loss function, and solves for the coefficients by minimizing the loss function.

To demonstrate the effectiveness of the model, a majority voting model was also built on the base classifiers, and the results of all models were compared using accuracy, recall, F1 value, and AUC. The experiments show that the support vector machine has the best F1 value and type I classification error rate, logistic regression has the best type II classification error rate, XGBoost has a higher recall on the minority class, and the neural network has the largest AUC. The performance of the individual base classifiers is therefore unstable. Compared with the base classifiers, majority voting gives better results on most indicators, especially minority-class recall, but it performs poorly on AUC, minority-class prediction accuracy, and F1 value, so the traditional voting model does not perform well enough. The combined model constructed in this thesis improves on every indicator relative to the base classifiers and the majority voting model. Compared with the best of the other models, the type I and type II classification error rates fall by 0.05%, model accuracy rises by 0.32%, and the accuracy, recall, and F1 value on the minority class rise by 2.05%, 3.32%, and 4.63%, respectively. The AUC also increases by 0.006. The experiments show that the combined model is effective, improves classification performance, and has practical significance.

The model can be improved in several ways. First, feature derivation could produce features that are more relevant to the prediction target. Second, the base classifiers used here are traditional models; choosing improved models as sub-classifiers may improve the classification ability of the combined model. Third, the combination method used here is linear; a non-linear combination method may improve prediction accuracy.
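
The coefficient-solving step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual loss function, which the abstract does not spell out. The sketch assumes a squared-error loss on the combined score, a penalty that weights each squared coefficient by its base classifier's type I error rate, and a single constraint that the coefficients sum to one. The function names, the penalty strength `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np


def solve_combination_weights(P, y, type1_err, lam=1.0):
    """Solve for linear-combination coefficients under assumed conditions.

    P         : (n_samples, k) scores, one column per base classifier
    y         : (n_samples,) 0/1 labels
    type1_err : (k,) type I error rate of each base classifier
    lam       : penalty strength (hypothetical parameter)

    Minimises ||P w - y||^2 + lam * sum_i type1_err[i] * w_i^2
    subject to sum(w) = 1.
    """
    P = np.asarray(P, dtype=float)
    y = np.asarray(y, dtype=float)
    k = P.shape[1]
    D = np.diag(np.asarray(type1_err, dtype=float))

    # Lagrangian: ||P w - y||^2 + lam * w^T D w + mu * (1^T w - 1).
    # Setting its gradient to zero gives the linear (KKT) system
    #   [ A    1 ] [ w  ]   [ 2 P^T y ]
    #   [ 1^T  0 ] [ mu ] = [ 1       ]   with A = 2 (P^T P + lam D).
    A = 2.0 * (P.T @ P + lam * D)
    kkt = np.zeros((k + 1, k + 1))
    kkt[:k, :k] = A
    kkt[:k, k] = 1.0
    kkt[k, :k] = 1.0
    rhs = np.concatenate([2.0 * (P.T @ y), [1.0]])

    solution = np.linalg.solve(kkt, rhs)
    return solution[:k]  # drop the multiplier mu


def combine(P, w, threshold=0.5):
    """Linearly combine base-classifier scores and threshold to 0/1."""
    return (np.asarray(P, dtype=float) @ w >= threshold).astype(int)


if __name__ == "__main__":
    # Synthetic stand-in for four base classifiers' scores (not thesis data).
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=200)
    scores = np.clip(labels[:, None] * 0.6
                     + rng.normal(0.2, 0.2, size=(200, 4)), 0.0, 1.0)
    made_up_type1 = np.array([0.10, 0.20, 0.15, 0.05])
    weights = solve_combination_weights(scores, labels, made_up_type1)
    print("weights:", weights, "sum:", weights.sum())
    print("accuracy:", (combine(scores, weights) == labels).mean())
```

Under these assumptions the Lagrangian is quadratic, so its stationarity conditions form a linear system that can be solved directly. Coefficients may come out negative because this sketch imposes no non-negativity constraint.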
Keywords/Search Tags:Customer Churn Prediction, Combined Model, Unbalanced Data, Lagrange Multiplier