Demonstration Study Of Customer Churn Prediction Based On Data Mining

Posted on: 2010-02-09  Degree: Master  Type: Thesis
Country: China  Candidate: X F Si  Full Text: PDF
GTID: 2189360275951248  Subject: Management Science and Engineering
Abstract/Summary:
In the real world, data distributions are often class-imbalanced. The imbalanced data problem affects many applications, for example customer churn, fraud detection and risk management. As data mining research has deepened, mining imbalanced data has become an active new field of research. The customer churn data sets studied in this thesis are typical imbalanced data, and the customers in question are the enterprise customers of web recruitment sites.

The global web recruitment industry is developing rapidly. It was reported that about 20 million job postings are released worldwide every day, that more than 3000 million people have posted their resumes on the Internet, and that the global recruitment market reached 17.2 billion dollars in 2006. In China, the web recruitment market reached 0.97 billion RMB in 2007 and 1.25 billion RMB in 2008, and was expected to reach 1.61 billion RMB in 2009. The large market size and the prospect of high profits have given rise to many new specialized, industry-specific and local recruitment web sites, which in turn has intensified competition in the web recruitment industry.

In the telecommunications, banking and insurance industries, building customer churn prediction models with data mining technology is a good choice and has produced fruitful research results. In the web recruitment industry, however, the study of enterprise customer churn is still at an initial stage. This thesis studies the imbalanced data mining problem in depth. It reviews and summarizes customer churn theory, research methods and their development, and it analyzes and discusses the characteristics, market size and growth prospects of China's web recruitment industry.
The Support Vector Machine (SVM), a popular data mining technique that has become a research hotspot in recent years because of its solid theoretical foundation and good generalization performance, is introduced and described systematically. Building on the customer churn problem and retention strategy, we carry out a demonstration study based on data mining, using enterprise customer characteristic data and online behavior log data collected from a well-known domestic web recruitment site.

The main results of the thesis are as follows. First, customer churn data sets are typically imbalanced and have unequal misclassification costs. By introducing cost-sensitive learning into the traditional SVM, the thesis proposes a cost-sensitive SVM model for customer churn prediction; experiments verify the validity of the model, which offers a reference for solving such problems. Second, to address the high dimensionality of customer churn data sets, the thesis proposes a prediction model that combines principal component analysis with a neural network; the empirical results show that the combination reduces the number of high-dimensional attributes, simplifies the neural network topology and improves the prediction performance of the model. Third, on the issue of retaining enterprise customers, the thesis discusses retention strategy. In addition, customer online behavior is analyzed by K-means clustering.
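The abstract does not give the formulation of the cost-sensitive SVM, so the following is only a minimal sketch of the general idea, under stated assumptions: a linear SVM whose hinge loss is scaled by a class-dependent cost, trained by batch subgradient descent. The function names, the cost values (4 for a missed churner, 1 for a false alarm), the learning rate and the toy data are all hypothetical and are not taken from the thesis.

```python
# Illustrative sketch only: a linear SVM with class-dependent misclassification
# costs, trained by batch subgradient descent. Names and values are hypothetical.

def weighted_hinge_loss(w, b, X, y, cost_pos, cost_neg, lam):
    """L2-regularised hinge loss; each sample's loss is scaled by its class cost."""
    total = 0.0
    for xi, yi in zip(X, y):
        cost = cost_pos if yi == 1 else cost_neg
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        total += cost * max(0.0, 1.0 - yi * score)
    reg = 0.5 * lam * sum(wj * wj for wj in w)
    return reg + total / len(X)


def train_cost_sensitive_svm(X, y, cost_pos=4.0, cost_neg=1.0,
                             lam=0.01, lr=0.1, n_iter=300):
    """Labels are +1 (churner, assumed minority) and -1 (retained customer)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1.0:  # margin violated: sample contributes
                cost = cost_pos if yi == 1 else cost_neg
                for j in range(d):
                    grad_w[j] -= cost * yi * xi[j] / n
                grad_b -= cost * yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (churn) or -1 (retain) from the sign of the decision value."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data invented for illustration: eight retained customers, two churners.
X = [[-3.0], [-2.5], [-2.0], [-1.5], [-2.0], [-3.0], [-2.5], [-1.5], [2.0], [3.0]]
y = [-1] * 8 + [1] * 2
w, b = train_cost_sensitive_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

Raising `cost_pos` relative to `cost_neg` makes a margin violation on a churner pull the decision boundary harder than one on a retained customer, which is one common way to offset class imbalance. In practice the costs would be chosen from the business cost of each error type or tuned on validation data.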
Keywords/Search Tags: data mining, customer churn prediction, un-balanced data, cost sensitive learning, support vector machine