Research On Customer Churn Prediction Algorithm Based On Multi-layer Perceptron And User Clustering

Posted on: 2021-01-08, Degree: Master, Type: Thesis
Country: China, Candidate: Q Tang, Full Text: PDF
GTID: 2428330629953114, Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the economy, competition among enterprises is becoming increasingly fierce. The cost of attracting a new customer keeps rising in the context of business diversification, market saturation, and economic globalization, so preventing customer churn can effectively improve an enterprise's profits. Accurately identifying customers with a high probability of churn is therefore an indispensable part of any customer retention strategy. A growing number of enterprises have realized the significance of retaining existing customers, and developing an early-warning capability has become a focus of enterprise development. With a churn prediction system, an enterprise can formulate strategies to improve customer satisfaction and prevent customer churn. However, because customer historical features are highly complex, highly redundant, and high-dimensional, constructing a prediction model has become an important task in the field of data mining. In this thesis, two churn prediction algorithms based on the multi-layer perceptron and one algorithm based on customer clustering are proposed. The main research work is as follows:

(1) A multi-layer perceptron prediction algorithm based on a stacked auto-encoder is proposed. Customer historical data usually contains a large number of discrete features, such as “Gender”, “Occupation” and “Nationality”. The prediction model cannot handle these features directly; they can be fed into the model only after being transformed into binary vectors via one-hot encoding. However, this method has two disadvantages: 1) it produces massive redundant information; 2) it greatly increases the feature dimension. To address these two disadvantages, a stacked auto-encoder is first used to compress the one-hot vector. The nonlinear transformation of the encoder generates an implicit fusion feature vector, which both eliminates the redundant information and reduces the feature dimension. The continuous feature vector and the fusion feature vector are then concatenated, and a cross-entropy loss function is constructed on the output of a multi-layer perceptron. Finally, the multi-layer perceptron is trained together with the stacked auto-encoder using the ADAM optimization algorithm. On public datasets, this algorithm achieves better prediction performance than many other prediction algorithms.

(2) A multi-layer perceptron prediction algorithm based on entity embedding and a factorization machine is proposed. A feature fusion network is applied because the information in customer features is redundant and the traditional multi-layer perceptron cannot generate feature interaction vectors. First, to eliminate the problems caused by one-hot encoding, entity embedding is used to process the one-hot vectors formed from the discrete features. The generated embedding vectors, each of which is a low-dimensional representation of an original discrete feature, are concatenated. Several sliding windows are then used to scan the original feature vector. At the same time, the factorization machine is applied to generate high-order feature vectors. In the experiments, the order of the feature vectors is controlled by changing the high-order terms of the polynomial regression. Finally, the continuous feature vector, the embedding vector, and the high-order feature vector are concatenated and used to train the multi-layer perceptron iteratively, with the cross-entropy loss function optimized by the ADAM algorithm. The experimental results show that the predictive performance of this algorithm is better than that of other algorithms on public datasets.

(3) A hybrid prediction algorithm based on customer clustering is proposed. This algorithm relies on the observation that customers in the same group tend to have similar traits, behavioral preferences, and focuses. It is divided into three stages. In the first stage, a multi-layer perceptron is first trained as a prediction model. Because the original customer features are both complex and redundant, a new feature vector is then generated to replace the original vector, based on the nonlinear representation capability of neural networks. In the second stage, the k-means algorithm is first applied to perform single-feature clustering, and the cluster centers are used to replace the original feature values. Customer clustering is then performed. In each case, the number of clusters is determined by the silhouette coefficient. In the third stage, different prediction models based on GBDT are constructed according to the characteristics of the different customer groups. The experimental results on public datasets show that this framework effectively improves the predictive performance of the original GBDT and outperforms many other prediction algorithms.
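
The dimensionality problem that motivates algorithms (1) and (2) can be illustrated with a small sketch. It is not the thesis's implementation: the category lists, the embedding dimension of 2, and the helper names `one_hot`, `make_embedding_tables`, and `embed` are all assumptions made for illustration, and the embedding tables are filled with random numbers where a real model would learn them during training. The sketch only shows that a one-hot vector grows with the total number of categories and is mostly zeros, whereas a per-feature embedding lookup yields a shorter dense vector.

```python
import random

# Hypothetical discrete features and category lists (not taken from the thesis).
CATEGORIES = {
    "Gender": ["female", "male"],
    "Occupation": ["engineer", "teacher", "doctor", "clerk"],
    "Nationality": ["CN", "US", "DE", "JP", "BR"],
}


def one_hot(customer):
    """Concatenate one binary block per discrete feature."""
    vector = []
    for feature, values in CATEGORIES.items():
        block = [0] * len(values)
        block[values.index(customer[feature])] = 1
        vector.extend(block)
    return vector


def make_embedding_tables(dim, seed=0):
    """Build one table per feature with one row per category.

    Rows are random here; in a trained network they would be learned weights.
    """
    rng = random.Random(seed)
    return {
        feature: [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in values]
        for feature, values in CATEGORIES.items()
    }


def embed(customer, tables):
    """Look up one dense row per feature and concatenate the rows."""
    vector = []
    for feature, values in CATEGORIES.items():
        vector.extend(tables[feature][values.index(customer[feature])])
    return vector


customer = {"Gender": "female", "Occupation": "doctor", "Nationality": "DE"}
sparse = one_hot(customer)
tables = make_embedding_tables(dim=2)
dense = embed(customer, tables)

print(len(sparse), sum(sparse))  # 11 dimensions, only 3 of them non-zero
print(len(dense))                # 3 features x 2 dimensions = 6
```

With realistic cardinalities (hundreds of occupations or nationalities) the gap between the two lengths becomes much larger, which is the redundancy the abstract refers to.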
Keywords/Search Tags: Customer churn prediction, Stacked auto-encoder, Entity embedding, Factorization machine, Customer clustering