Research And Implementation Of Telecom Customer Churn Prediction Model Based On Multi-method Fusion

Posted on: 2021-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: C W Zhang
Full Text: PDF
GTID: 2428330611965696
Subject: Software engineering
Abstract/Summary:
With the continued development of informatization, telecom operators hold massive data resources, and it is of great significance to use data mining technology to build telecom customer segmentation models and customer churn prediction models. This thesis analyzes the customer and business data of a telecommunications operator's company in one city, builds a customer segmentation model and a customer churn prediction model, and then studies customer retention strategies based on the customer segmentation. The main contents are:

1. Customer segmentation based on telecommunications customer data. To address the messiness and large volume of telecommunications data, data cleaning is performed before the main work. On the one hand, the categorical features of the data are visualized to analyze how different features affect the distribution of customer churn. On the other hand, evaluation criteria for comprehensive customer value are formulated, dividing comprehensive customer value into realized value, unrealized value, and customer loyalty; the K-means algorithm is then improved in combination with business logic to segment customers into five customer groups.

2. Several feature selection methods are compared by their effect on churn prediction across different classifiers. Experiments show that the F-test method performs best. In addition, to address class imbalance, three oversampling methods (random oversampling, SMOTE, and ADASYN) are compared on the decision tree and XGBoost models. Experiments show that all three oversampling methods improve prediction to a certain extent; random oversampling works better with the decision tree, and SMOTE works better with the XGBoost model.

3. The better-performing classifiers (decision tree, random forest, and XGBoost) are selected as base classifiers for model fusion. The model fusion methods are Bagging and Stacking. In the Bagging experiment, the decision tree is used as the base classifier; in the Stacking experiment, the decision tree, random forest, and XGBoost are used as base classifiers, and logistic regression is used as the second-level (meta) classifier. The experimental results show that the Bagging ensemble classifier effectively improves classification compared with a single decision tree, and that the recall of the Stacking ensemble classifier can reach 85.66%, which is significantly better than the three base classifiers.

4. Retention strategy research. First, keyword mining is carried out on the text of the customers' business recommendation memos, and business recommendation strategy features are obtained by combining high-frequency business keywords. Then, by analyzing the comprehensive value of customers and the proportion of churned customers, the customer groups are ranked by retention priority to reduce retention cost. Finally, the business recommendation strategies are scored against the distribution of customer churn, and several items are selected for each customer group as its final retention strategy.
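The random oversampling step named in item 2 can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `random_oversample`, the toy feature values, and the 8-to-2 class split are all hypothetical and chosen only to show the idea, which is to duplicate randomly chosen minority-class rows until every class is as large as the largest one.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class that are available for duplication
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (made up): 8 retained customers (label 0) and 2 churned (label 1)
rows = [[m, m * 10.0] for m in range(10)]
labels = [0] * 8 + [1] * 2

bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)
print(balanced_counts)
```

In practice, oversampling of this kind is applied only to the training split, so that duplicated rows do not leak into the data used to measure recall.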
Keywords/Search Tags: telecommunications industry, customer subdivision, churn prediction, class imbalance, retention strategy