| With the continuing advance of digitalization, many companies are adopting databases, cloud storage, and other platforms, and are accumulating large volumes of customer data as they automate more of their processes. As big data technologies have matured, some enterprises have begun to mine this customer data for information that can guide their operations. In a competitive and fast-changing market, the products sold by companies in the same line of business tend to become highly homogeneous, so customers become a key resource in the contest for market share and hold the dominant position. When an enterprise cannot meet its customers’ needs, or when better alternatives appear, customers churn easily. This holds even more strongly for B2B enterprises, because attracting a new customer costs more than retaining an existing one. Customer retention is therefore particularly important for B2B enterprises: the returns from acquiring new customers are gradually declining, while retaining existing customers brings substantial benefits. Against this background, this thesis takes X company as an example and builds a machine learning model that predicts the churn probability of B2B customers, extracts churn signals from existing customer information, and, to some extent, guides X company in improving its customer retention strategies and its own products and services. Building on existing research, the thesis first reviews the background of customer churn behavior and then constructs an indicator system covering customers’ basic attributes, trading behavior, social network features, and sentiment characteristics, tailored to the B2B setting and X company’s business model. An RPA tool was first used to collect data from X company’s data platforms
and the Qichacha app, while Python was used to reorganize 603,450 records of customers’ trading behavior and 39,755 records of return behavior, yielding a final dataset of dimension 1244×21. The cleaned sample data are divided into a training dataset and a test dataset. On the training dataset, a random survival forest (RSF) model is first fitted to determine the importance of each variable and the interactions between variables, giving preliminary RSF predictions on the test dataset. A Cox model is then built on the results of the RSF model, giving the predictions of the combined Cox-RSF model on the test dataset. The results show that the Cox-RSF model outperforms the RSF model in Ratio and recall. Compared with the other benchmark models, the Cox-RSF model also performs better in accuracy, precision, recall, and Ratio. Based on the prediction model established in this thesis, three conclusions follow. First, compared with the SVM, BP neural network, LR, and RF models, the Cox-RSF model has the strongest ability to identify customer churn. Second, the estimated partial regression coefficients of the Cox-RSF model indicate that the hazard of customer churn decreases when purchase volume increases, when the numbers of returns and complaints decrease, and when feedback is more responsive. Third, the typical features of churned B-end customers in this thesis are a falling consumption frequency, a rising number of returns during the critical observation period, and, for some customers, a lower dependence on X company’s products or services. |
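
The second conclusion reads the direction of churn risk from the signs of the Cox partial regression coefficients. The sketch below illustrates that reading under the standard proportional hazards form, where the hazard relative to the baseline is exp(β·x): a negative coefficient gives a hazard ratio below 1 (lower churn hazard) and a positive coefficient gives one above 1. The variable names and coefficient values are hypothetical and chosen only for illustration; they are not the estimates reported in this thesis.

```python
import math

# HYPOTHETICAL coefficients for illustration only; NOT the thesis's estimates.
HYPOTHETICAL_COEFS = {
    "purchase_volume": -0.40,   # negative sign: more purchases, lower churn hazard
    "return_count": 0.25,       # positive sign: more returns, higher churn hazard
    "complaint_count": 0.30,    # positive sign: more complaints, higher churn hazard
}


def hazard_ratio(coef: float, delta: float = 1.0) -> float:
    """Multiplicative change in hazard when one covariate rises by `delta`."""
    return math.exp(coef * delta)


def relative_hazard(coefs: dict, covariates: dict) -> float:
    """exp(beta . x): a customer's hazard relative to the baseline hazard."""
    linear_predictor = sum(
        coefs[name] * covariates.get(name, 0.0) for name in coefs
    )
    return math.exp(linear_predictor)


if __name__ == "__main__":
    for name, coef in HYPOTHETICAL_COEFS.items():
        direction = "lowers" if coef < 0 else "raises"
        print(f"{name}: HR = {hazard_ratio(coef):.3f} ({direction} churn hazard)")
```

In practice the covariates would usually be standardized before fitting, so that a one-unit change is comparable across variables.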