
Algorithm Research On Customer Churn Prediction

Posted on: 2016-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
Full Text: PDF
GTID: 2308330464952611
Subject: Computer application technology
Abstract/Summary:
With the development of computer networks and information technology, the era of big data has arrived. Big data is high-dimensional, nonlinear, imbalanced, and often uncertain, so extracting valuable information from it is difficult. Large-scale data processing matters to enterprises because customer data contains important information. Customer churn prediction is a classification task over large volumes of customer data, and the characteristics of big data affect its accuracy: a model is built from a large amount of historical customer data and is then used to classify future customer data. Data classification is an active research topic, and classification and prediction methods have mainly been based on statistics and artificial intelligence. The support vector machine (SVM) is not sensitive to the volume of data, so it is well suited to classifying big data. A second pass with the K-Nearest Neighbor (KNN) algorithm then improves the classification results. Selective ensemble learning selects trained classifiers according to some criterion and combines them into a new model. This paper studies customer churn classification based on these two methods. The main contents are as follows:

1. The support vector machine and the K-Nearest Neighbor algorithm are used to classify and predict customer churn. After preprocessing, churned customers make up only a very small proportion of the massive customer data, which makes this a typical imbalanced classification problem. To improve the accuracy of identifying churned customers in imbalanced data, the Manhattan distance between the positive and negative data is computed first. The weights of the two classes are then adjusted to eliminate the bias caused by the imbalance. The improved support vector machine is trained on the training data, and the resulting model classifies the future-data test set in a first pass. KNN then classifies part of the data a second time, correcting individual misclassified samples. Validation on real data from a telecom company shows that this is a relatively good model.

2. To overcome the limitations of a single classification model, selective ensemble learning is used to classify the customer churn data. First, Bayes, decision tree, neural network, and support vector machine classifiers are selected as base classifiers. Multiple classifiers are trained on the cycle data set, and the accuracy of each is computed. Classifiers are then selected according to their accuracy so that the final model performs better, and the selected base classifiers are weighted with a Gaussian function. Experiments on real data show that selective ensemble learning with Gaussian weighting achieves better classification accuracy and lift.

This paper presents two approaches to customer churn prediction. For imbalanced data, the improved support vector machine corrects the classification bias, and KNN then reclassifies part of the data; processing the data in these two passes gives better classification results. A single classifier has limitations, so selective ensemble learning combines well-performing classifiers so that they complement one another, and the Gaussian-weighted selective ensemble obtains better results on real data. The two methods can serve as a reference for customer relationship management.
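
The two approaches can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the formulas, so the inverse-frequency class weights, the `margin` rule that decides which samples get a second pass, and the Gaussian weight centered on the best accuracy (with width `sigma`) are all assumptions, and every function name is hypothetical. The first-stage `score` stands in for a signed decision value from a trained SVM, which is not implemented here.

```python
import math
from collections import Counter


def class_weights(labels):
    """Assumed heuristic: weight each class inversely to its frequency."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def manhattan(a, b):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(x, train_X, train_y, k=3):
    """Majority vote among the k nearest training points (L1 distance)."""
    pairs = sorted(zip(train_X, train_y), key=lambda p: manhattan(x, p[0]))
    return Counter(label for _, label in pairs[:k]).most_common(1)[0][0]


def two_stage_predict(x, score, train_X, train_y, margin=0.5, k=3):
    """Keep the first-stage label unless the score is near the boundary.

    `score` is assumed to be a signed decision value from the first-stage
    classifier; samples with |score| < margin are reclassified by KNN.
    """
    if abs(score) >= margin:
        return 1 if score > 0 else -1
    return knn_predict(x, train_X, train_y, k)


def gaussian_weights(accuracies, sigma=0.1):
    """Assumed weighting: Gaussian falloff from the best accuracy."""
    best = max(accuracies)
    return [math.exp(-((best - a) ** 2) / (2 * sigma ** 2)) for a in accuracies]


def selective_vote(predictions, accuracies, keep=3, sigma=0.1):
    """Keep the `keep` most accurate classifiers and take a weighted vote.

    `predictions` holds one +1/-1 label per base classifier for a sample.
    """
    order = sorted(range(len(accuracies)), key=lambda i: accuracies[i], reverse=True)
    chosen = order[:keep]
    weights = gaussian_weights([accuracies[i] for i in chosen], sigma)
    total = sum(w * predictions[i] for w, i in zip(weights, chosen))
    return 1 if total >= 0 else -1


if __name__ == "__main__":
    X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    y = [-1, -1, -1, 1, 1, 1]
    print(class_weights([1, -1, -1, -1]))
    print(two_stage_predict((5, 5.5), -0.1, X, y))
    print(selective_vote([1, -1, -1, -1], [0.9, 0.6, 0.85, 0.8]))
```

In this sketch a sample with a confident first-stage score keeps its label, while a borderline one is decided by its nearest labeled neighbors. In the ensemble, the least accurate base classifier is dropped before voting. In practice the score, the margin, and the accuracies would come from trained models and held-out data.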
Keywords/Search Tags: Customer churn prediction, Support vector machine, KNN algorithm, Selective ensemble learning