| Small and micro enterprises are numerous, their information disclosure mechanisms are imperfect, and their operating status is opaque, so it is difficult for the relevant departments to assess their credit status accurately. This in turn makes it difficult to support the healthy development of these enterprises, which indirectly affects the country's economic development. As the goal of economic development shifts from high-speed growth to high-quality development, credit management for small and micro enterprises becomes particularly important. This article applies machine learning methods to enterprise transaction flow data to predict the credit status of small and micro enterprises. The specific work is summarized as follows: 1. Data preprocessing. Missing values and outliers are handled, characteristic variables are constructed according to the research target, and the SMOTE algorithm is used to process the unbalanced data in preparation for modeling. 2. Descriptive statistical analysis. Charts are used to express the structural characteristics of the data, summarizing its distribution simply and clearly and deepening the understanding of the data. 3. Construction of a logistic regression model for corporate credit prediction. Logistic regression is a log-linear model that is widely used in many fields. This article compares the prediction accuracy of the logistic regression model on the data before and after applying the SMOTE algorithm, and concludes that the SMOTE algorithm works well on unbalanced data. Cross-validation is used to verify the predictive performance of the model. 4. Construction of algorithm models for corporate credit prediction. Decision tree, random forest, support vector machine, and GBDT algorithms are used; the models are optimized by adjusting parameters and evaluated by precision, recall, F1 value, and AUC. A comprehensive comparison shows that the random forest model predicts the credit of small and micro enterprises relatively better than the other models, with a prediction accuracy of about 94.83% on the test set. Based on the characteristics of small and micro enterprises, this paper analyzes the transaction flow data generated in enterprise operation, uses data mining methods to construct characteristic variables and predict enterprise credit status, and selects the random forest model as the final credit prediction model. This study provides a reference for relevant institutions to predict and identify the credit status of small and micro enterprises. |
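
The abstract relies on SMOTE to balance the two credit classes before modeling. As a rough illustration of the idea only (not the authors' implementation), the sketch below generates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy feature rows are all hypothetical; a real study would more likely use an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the rare class.
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean distance).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new sample at a random position on the segment
        # between the base point and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: each row is [feature_1, feature_2].
majority = [[float(i), float(i % 3)] for i in range(20)]
minority = [[1.0, 10.0], [2.0, 11.0], [3.0, 10.5], [2.5, 12.0]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 20 20
```

In practice, oversampling is usually applied to the training split only, so that synthetic samples do not leak into the data used to measure precision, recall, F1, and AUC.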