| With the continuous development of the Internet, it increasingly supports information transmission, real-time communication, currency transactions, and more, and people have generated a large amount of data online. Enterprises earn income by serving users, so users play an increasingly important role. Most enterprises now pay growing attention to using big data to provide users with precise marketing services and thereby tap the potential commercial value of customers. Profiling users from their consumption data provides a basis for subsequent user group division, user-product association, recommendation, and so on. User portraits therefore have important application scenarios, and building portraits from users’ own characteristics to provide accurate services is worthy of in-depth study. On the other hand, with the disappearance of the domestic Internet demographic dividend, the cost of acquiring new users will rise further. Enterprise development has shifted from an "incremental" customer model to an "inventory" customer model, and the user loss rate is like the outlet of a reservoir: when the loss rate is high, the outflow is faster than the inflow, and the water level in the reservoir can hardly rise. Lean operations and reduced user churn are therefore a key issue for enterprises. By regularly identifying users who are about to be lost, operations staff can retain them in a targeted manner, which greatly reduces labor and time costs. Taking timely retention measures to prevent customer churn has thus become a significant research issue. The research contents of this paper are as follows. First, the user portrait problem of UFIDA is studied. For UFIDA’s Youspace product, based on users’ order data for the past three years, the user attribute characteristics of purchased product, region, industry, and order year and month are descriptively 
analyzed. Because the three-year overall attribute characteristics cannot show in detail whether orders are affected by the year, each attribute is also cross-analyzed by year. The K-means clustering method is then used to divide users into four categories based on order amount and number of purchasers: the VIP user group, the main user group, the ordinary user group, and the small and micro user group. In the past, user portraits were analyzed for individual users; this paper instead takes B-end (enterprise) users as the target of analysis. The indicators change accordingly, covering the company’s region, the company’s industry category, and the number of purchasers at the company. Second, for the four user groups above, the Apriori association analysis method is used to associate users’ basic attribute characteristics with order amount and number of purchasers. Portraits are built for the VIP, main, ordinary, and small and micro user groups, and the characteristics of each group are analyzed so that users can be given detailed and accurate services. If UFIDA implements user portraits, they can guide front-line sales staff, greatly reduce time and money costs, support precise marketing and faster deal closing, and the research results can also be applied in serving enterprise customers. Finally, a customer churn prediction model is built. To address the imbalanced sample data, this paper introduces Bootstrap resampling, implemented in Python, to bring the sample closer to balance and expand the small sample. Combining the analysis of customer order data and behavior data from UFIDA’s Youspace product line, customer churn prediction models are constructed in Python with five algorithms: logistic regression, decision tree, random forest, SVM, and 
KNN. Although the accuracy of SVM is high, ten-fold cross-validation shows that the trained model is unstable: sometimes the predictions are all correct, and sometimes they are all wrong. This model therefore cannot be applied in practice. Random forest classifies well, with a model accuracy of about 81% and an area under the ROC curve of 0.92. The churn prediction model can therefore be applied in practice to target customers who are about to be lost, which allows operators to formulate retention strategies and thereby greatly reduce time, labor, and money costs. |
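
The Bootstrap resampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact resampling scheme, so this example assumes the simplest variant: each minority class is drawn with replacement until it matches the size of the largest class. The function name `bootstrap_balance`, the `seed` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def bootstrap_balance(rows, labels, seed=0):
    """Oversample each smaller class with replacement until it matches
    the size of the largest class. Returns new (rows, labels) lists."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Keep the original samples, then draw the shortfall with replacement.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 8 retained customers (0) and 2 churned customers (1).
rows = [[i, i * 10] for i in range(10)]
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = bootstrap_balance(rows, labels)
print(Counter(balanced_labels))
```

In a cross-validated setup, resampling of this kind is usually applied to the training folds only, so that duplicated rows do not appear in both the training and validation data.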