| With the rapid development of the global e-commerce platform industry, customer complaints and customer churn are becoming increasingly serious problems. For a platform, customer complaints undoubtedly harm the development of its merchants, and the high cost of churn is forcing platforms to pay ever more attention to it. At the same time, given the limited resources of existing large platforms, crude, undifferentiated marketing is no longer viable; sustainable customer development now depends on allocating resources rationally to drive consumer spending. Taking e-commerce customers' consumption data as its starting point, this paper analyzes the necessity of establishing a customer complaint early-warning model and illustrates the value of the improved random forest (RF) algorithm. The main tasks are as follows. First, customers' historical complaint data are analyzed comprehensively against their consumption behavior to obtain the environmental factors that affect complaints. Second, the attributes of the complaint mechanism are captured, their advantages and disadvantages are weighed, and the improved random forest optimization algorithm is selected for modeling; model evaluation shows that the improved Bayesian algorithm and the SMOTE-improved RF algorithm are more stable in operation and generalize better than a single model for customer complaint early warning. Third, the attributes related to churn are separated out. Based on customers' historical consumption records and attribute values, 26 attribute variables are constructed from transaction attributes such as consumption date and time and purchase amount. Basic customer features are selected using LASSO regression (an L1 regularization term) together with an L2 regularization term. Logistic regression serves as the base model, so that the L1 and L2 terms can reduce the risk of overfitting; the combination offers strong predictive power and robust parameters on binary classification problems. Within this feature framework, LASSO feature selection is first used to screen the feature groups, eliminating those with poor predictive accuracy; logistic regression with L2 regularization is then used to identify the factors driving customer churn. Customers are segmented into three types: low-value, general-value, and important-value customers. Training shows that the churn prediction model built on this feature framework performs better when constructed with the fused LASSO and logistic regression algorithm. For important-value customers, prediction accuracy after customer classification is higher than before: Accuracy, Precision, and F1-score increased by 6%, 5%, and 3% respectively. The true positive rate reached nearly 100%, and churned customers were misclassified so rarely that the errors are negligible. Finally, a feature fusion algorithm based on XGBoost and random forest is used to mine the key factors common to customer complaints and churn, yielding four types of factors: customer value and customer conversion rate, customer visits, historical order records, and commodity price and evaluation. Corresponding customer retention suggestions are put forward in three areas: first, strengthen communication with customers and manage customer relationships well; second, build customer profiles, attend to customer value, and achieve precision marketing; third, improve brand competitiveness and respond promptly to competing platforms and to changes in market demand. |
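
The two-stage idea described in the abstract (LASSO-style screening followed by an L2-regularized logistic model) can be illustrated with a minimal sketch. It is not the paper's implementation: the optimizer (proximal gradient descent), the penalty strengths, the toy data, and the names `fit_logistic`, `soft_threshold`, and `churn_probability` are all assumptions made for illustration, and the abstract does not state how the two penalties were actually combined.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_logistic(X, y, l1=0.0, l2=0.0, lr=0.1, epochs=500):
    """Logistic regression fitted by proximal gradient descent.

    The L2 penalty enters the gradient; the L1 penalty is applied through
    soft-thresholding, which can set weights exactly to zero.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            step = w[j] - lr * (grad_w[j] / n + l2 * w[j])
            w[j] = soft_threshold(step, lr * l1)
        b -= lr * grad_b / n
    return w, b


# Toy data (made up): column 0 tracks the label, column 1 is balanced noise.
X, y = [], []
for x0, label, pairs in [(1.0, 1, 3), (1.0, 0, 1), (-1.0, 0, 3), (-1.0, 1, 1)]:
    for _ in range(pairs):
        for noise in (1.0, -1.0):
            X.append([x0, noise])
            y.append(label)

# Stage 1: L1-penalized fit; keep only features whose weight is non-zero.
w_l1, _ = fit_logistic(X, y, l1=0.05)
selected = [j for j, wj in enumerate(w_l1) if wj != 0.0]

# Stage 2: refit on the selected features with an L2 penalty only.
X_sel = [[row[j] for j in selected] for row in X]
w_final, b_final = fit_logistic(X_sel, y, l2=0.01)


def churn_probability(row):
    """Predicted probability of the positive class for one raw feature row."""
    z = sum(w * row[j] for w, j in zip(w_final, selected)) + b_final
    return sigmoid(z)
```

On this toy data the L1 stage keeps only column 0, so `churn_probability` ignores the noise column entirely. With real customer data one would standardize the features and choose the penalty strengths by cross-validation rather than fixing them by hand.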