| With the growth of the domestic real estate market and infrastructure projects over the past 20 years, building decoration materials enterprises have sprung up like bamboo shoots after rain, and market competition is severe. In particular, the first foreign enterprises, which entered the domestic building decoration materials market in the early 1990s, have gradually lost market share to local brands and, with it, their original brand advantage. Seizing business opportunities is therefore essential to gaining a foothold and taking a lead in the current market. Unlike consumer goods enterprises, which are based on the B2C model, building decoration materials enterprises rely on the B2B model to connect with business opportunities arising from engineering projects. For project-based business opportunities, the sales funnel model in CRM is usually used to manage the opportunity process. If systematic historical data can be used objectively to predict the win rate at the initial stage of a business opportunity, sales staff can adjust the marketing plan in advance based on the forecast, and the forecast can also be used to optimize and improve the supply chain planning process, with the ultimate goal of increasing revenue and profit. This paper uses historical data from the CRM system of Company F, a foreign-funded building decoration materials enterprise, together with four popular machine learning models, Random Forest, XGBoost, LightGBM and Support Vector Machine, aiming to obtain a suitable model for predicting Company F's win rate on business opportunities. The experimental work mainly includes the following aspects: (1) Pre-processing of the original data. The factors that affect the win rate of a business opportunity are analyzed, and the outliers and missing values in the original data are cleaned. (2) Feature engineering design. Starting from the original related features, derived features are constructed according to business experience. LightGBM is used to rank feature importance, and the top 20 feature variables are then rescaled to a common scale so that they can be used in later model training. (3) Model training and testing. The imbalanced training samples are treated with an oversampling method. The hyperparameters of Random Forest, XGBoost, LightGBM and Support Vector Machine are optimized using five-fold cross-validated grid search and manual parameter tuning. Accuracy, recall, AUC and the ROC curve of the binary classification models are used to compare the performance of the four models under default parameters and under optimal parameters. Finally, the predictive performance and stability of the four models are verified on test samples. The following results are obtained: (1) After data cleaning and oversampling, the originally imbalanced samples yield good predictions with the tuned Random Forest, XGBoost, LightGBM and SVC models. (2) Although there are few original related features, the features derived from business experience rank high in feature importance, indicating that the feature engineering was effective to a certain extent. (3) After a multi-dimensional comparison of the models, Random Forest and LightGBM, as representatives of bagging and boosting algorithms respectively, can be considered the two most suitable models for predicting the win rate of business opportunities at Company F. (4) Parameter optimization plays a significant role in improving the generalization ability of the XGBoost, LightGBM and Support Vector Machine models. (5) The SVC model is very sensitive to fluctuations in small-sample data and to the method used to handle imbalanced data. In this research, training on more than 10,000 records required a very powerful computer to reduce the training time, and the oversampling method was superior to the undersampling method. (6) The five most important features can be used as a basis for Company F to develop S&OP strategies that improve the win rate. (7) CRM system data combined with machine learning algorithms can be used to optimize Company F's supply chain planning process, so as to promote three-party collaborative communication among demand planning, MRP planning and sales personnel. |
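
The oversampling and evaluation steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the paper's implementation: the function names (`random_oversample`, `accuracy`, `recall`, `roc_auc`), the toy data and the 0.5 decision threshold are all assumptions made for illustration, and plain random duplication stands in for whichever oversampling method the study actually used.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """True positives divided by all actual positives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive is scored above a
    random negative; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up data: 1 = opportunity won, 0 = opportunity lost.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9], [0.8]]
labels = [0, 0, 0, 0, 1, 1]
bal_rows, bal_labels = random_oversample(rows, labels, seed=42)
class_counts = Counter(bal_labels)  # both classes now have 4 rows

# Made-up model scores for six held-out opportunities.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold

print(class_counts)
print(accuracy(y_true, y_pred), recall(y_true, y_pred), roc_auc(y_true, scores))
```

In practice, oversampling is usually applied to the training split only, so that duplicated rows do not leak into validation or test data; recall and AUC are reported alongside accuracy because accuracy alone can look good on imbalanced data even when few won opportunities are identified.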