Application Of Random Forest Method In The Customer Churn Prediction

Posted on: 2009-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Qiu
GTID: 2189360272990232
Subject: Control theory and control engineering
Abstract/Summary:
As communication terminals become ever more widespread, competition among telecom enterprises to attract customers and expand their market share grows increasingly fierce. According to the latest cost accounting structure of the telecom industry, the cost of losing an existing customer is five times the profit that a new customer can bring. Customer churn prediction has therefore become the most important task in such an increasingly saturated market.

Telecom data sets are large and grow continuously. Because the Random Forest (RF) method handles very large data sets efficiently and tolerates noise well, this thesis introduces it into the construction of a churn prediction model for a regional branch of Fujian Mobile.

First, we build an initial churn prediction model using the RF method. During data preprocessing, we use the outlier detection procedure provided by RF to identify abnormal samples. Compared with other commonly used algorithms, the RF-based method proves more effective and less time-consuming. After removing the abnormal samples from the customer data, we build a random forest to predict each customer's probability of churning. Compared with other existing models, the RF model turns out to be more accurate.

Furthermore, using the sample proximity matrix provided by RF, we obtain scaling coordinates for each sample through this feature mapping. Combining this with transductive inference, the thesis proposes a projection method based on transductive inference and coordinate scaling within the RF framework.
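
The RF proximity matrix and the proximity-based detection of abnormal samples mentioned above can be illustrated with a small sketch. It is not the thesis's implementation: the tiny forest, the toy two-feature data, and the helper names (`build_tree`, `leaf_id`, `proximity_matrix`, `outlier_scores`) are all assumptions made for illustration. The proximity of two samples is taken as the fraction of trees in which they fall into the same leaf, and the raw outlier score follows the usual RF convention of dividing the sample count by the sum of squared proximities to same-class samples (the per-class normalisation step is omitted here).

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def build_tree(X, y, rows, depth, rng):
    """Grow one tree; None marks a leaf. Each node tests one random feature."""
    if depth == 0 or len({y[i] for i in rows}) <= 1:
        return None
    f = rng.randrange(len(X[0]))
    vals = sorted({X[i][f] for i in rows})
    best = None
    for lo, hi in zip(vals, vals[1:]):
        t = (lo + hi) / 2.0  # midpoint threshold keeps both sides non-empty
        left = [i for i in rows if X[i][f] <= t]
        right = [i for i in rows if X[i][f] > t]
        score = (len(left) * gini([y[i] for i in left])
                 + len(right) * gini([y[i] for i in right]))
        if best is None or score < best[0]:
            best = (score, t, left, right)
    if best is None:  # all in-bag values identical on this feature
        return None
    _, t, left, right = best
    return (f, t, build_tree(X, y, left, depth - 1, rng),
            build_tree(X, y, right, depth - 1, rng))

def leaf_id(tree, x):
    """Identify the leaf reached by x via its left/right path string."""
    path = ""
    while tree is not None:
        f, t, left, right = tree
        if x[f] <= t:
            path, tree = path + "L", left
        else:
            path, tree = path + "R", right
    return path

def proximity_matrix(X, y, n_trees=50, depth=3, seed=0):
    """prox[i][j] = fraction of trees in which samples i and j share a leaf."""
    rng = random.Random(seed)
    n = len(X)
    prox = [[0.0] * n for _ in range(n)]
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        tree = build_tree(X, y, rows, depth, rng)
        leaves = [leaf_id(tree, x) for x in X]
        for i in range(n):
            for j in range(n):
                if leaves[i] == leaves[j]:
                    prox[i][j] += 1.0 / n_trees
    return prox

def outlier_scores(prox, y):
    """Raw outlier score: n / sum of squared proximities to same-class samples."""
    n = len(y)
    scores = []
    for i in range(n):
        s = sum(prox[i][j] ** 2 for j in range(n) if j != i and y[j] == y[i])
        scores.append(n / s if s > 0 else float("inf"))
    return scores

# Toy data (hypothetical): two well-separated groups plus one sample that
# carries label 0 but sits inside the label-1 group.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)] for _ in range(15)]
X += [[data_rng.gauss(5, 0.3), data_rng.gauss(5, 0.3)] for _ in range(15)]
y = [0] * 15 + [1] * 15
X.append([5.0, 5.0])
y.append(0)

prox = proximity_matrix(X, y)
scores = outlier_scores(prox, y)
print("most abnormal sample index:", scores.index(max(scores)))
```

A sample that rarely shares a leaf with other members of its own class receives a large score, which is the property an RF-based screening step relies on. The same `prox` matrix is what a scaling-coordinate step would take as input.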
Experiments demonstrate the effectiveness and simplicity of the proposed method for dimension reduction, and also indicate that it can model the information contained in the samples.

Furthermore, we combine a hyper-ellipsoidal K-means clustering algorithm based on the Mahalanobis distance with the above work (HCkmean-in-RF for short) to reduce the generalization error of the customer churn prediction model. Experiments show that the improved model achieves better accuracy and interpretability. Based on an analysis of the improved model's predictions, we offer different suggestions for different kinds of customers.

The proposed improved model is therefore expected to be a strong candidate for customer churn prediction in the telecom industry.
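
To make the idea of K-means clustering under the Mahalanobis distance concrete, here is a minimal two-dimensional sketch. It is a generic illustration, not the HCkmean-in-RF procedure of the thesis: the function names (`mean_cov`, `mahalanobis_sq`, `mahalanobis_kmeans`), the small diagonal regularisation term, the minimum cluster size of three, and the toy data are all assumptions. Each cluster keeps its own covariance matrix, so clusters can take elongated, ellipsoidal shapes instead of the spherical ones implied by the Euclidean distance.

```python
import random

def mean_cov(points):
    """Mean and lightly regularised 2x2 covariance of a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n + 1e-6
    syy = sum((p[1] - my) ** 2 for p in points) / n + 1e-6
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(p, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def mahalanobis_kmeans(points, init_centers, n_iter=20):
    """K-means where each cluster measures distance with its own covariance."""
    centers = list(init_centers)
    covs = [(1.0, 0.0, 1.0)] * len(centers)  # identity: first pass is Euclidean
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(len(centers)),
                key=lambda k: mahalanobis_sq(p, centers[k], covs[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if len(members) >= 3:  # too few points: keep the previous estimate
                centers[k], covs[k] = mean_cov(members)
    return labels, centers

# Toy data (hypothetical): two clusters stretched along the x-axis.
rng = random.Random(0)
points = [(rng.gauss(0, 3), rng.gauss(0, 0.2)) for _ in range(30)]
points += [(rng.gauss(0, 3), rng.gauss(3, 0.2)) for _ in range(30)]

labels, centers = mahalanobis_kmeans(points, [(0.0, 0.0), (0.0, 3.0)])
print(labels)
print(centers)
```

In a churn setting the points would be customer feature vectors (or their scaling coordinates) rather than synthetic pairs, and the covariance inverse would be computed for the full feature dimension.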
Keywords/Search Tags: Customer Churn Prediction, Random Forests (RF), Transduction