Default risk discrimination for small enterprises means building a model that determines whether a small enterprise is in default, based on the functional relationship between its financial, non-financial, and external macro variables and its default status. The results of default discrimination can provide decision support for banks, investors, and regulators. Small enterprises are essential to the national economy. To identify small-business credit risk, reduce bad-debt losses from default, and avoid inaccurate feature combinations caused by incomplete information disclosure, it is essential to build an accurate default discrimination model that reflects default risk. This paper studies the construction of a small-enterprise default discrimination model based on optimal feature transformation. The study emphasizes two points. The first is optimal feature selection. When different combinations of indicators are used to judge the default of the same enterprise, the discrimination results can differ greatly or even be completely opposite. Therefore, there must be an optimal combination of features that gives the default discrimination model its highest discrimination accuracy. The second is the transformation of feature values. Modeling on the original data and modeling on numerically transformed data yield two different default discrimination models, whose results are bound to differ when they are applied to the same enterprise. Applying different transformation methods to the same data set also produces quite different model accuracy. Therefore, a reasonable numerical transformation of the original feature data is needed to improve the accuracy of default discrimination. The first innovation of this paper is optimal feature selection. We substitute all n features into a Linear-SVM to obtain the weight of each indicator in the model. We take the absolute value of the i-th weight as the
threshold. We then select the features whose weights are greater than or equal to the threshold to build the i-th feature combination. In this way, n feature combinations can be constructed. Among these n feature combinations, we maximize the F-score to deduce the optimal feature combination in reverse, which resolves both the subjectivity of threshold selection and the problem of feature weights being determined separately from the default discrimination model. The second innovation is the transformation of features. The optimal weights w* from the final training output of the deep neural network on 1,000 samples are used to weight the sample data x_ij, giving the transformed modeling samples w*_i·x_ij, which improves the accuracy of the default discrimination model. Third, empirical results show that the default discrimination accuracy of the model in this study is higher than that of LSTM, CNN, and three other deep learning models; higher than that of random forest, GBDT, and five other machine learning models; and higher than that of the logistic regression and LDA statistical learning models. The study found that, compared with financial indicators, non-financial indicators better reflect the default tendency of small enterprises, and that the importance of macro indicators should not be underestimated. Among the non-financial indicators, "patent status" is the most important, contributing 11.81% of the importance and ranking first. Among the macro indicators, "industry prosperity coefficient" is the most important, contributing 10.57% of the importance and ranking second in the whole feature combination. Among the financial indicators, "total asset turnover speed" is the most important, contributing 9.63% of the importance and ranking third.
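
The feature-selection step described above can be illustrated with a minimal Python sketch. It builds the n candidate feature combinations from a weight vector and keeps the one with the highest score. Several points are assumptions rather than details taken from the paper: the weights are taken as already obtained from a fitted linear SVM, a feature is kept when the absolute value of its weight is greater than or equal to the threshold, the F-score is computed as the F1 score with default (label 1) as the positive class, and the function names and the `evaluate` callback are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def f_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score with default (label 1) as the positive class (assumed variant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def candidate_combinations(weights: Sequence[float]) -> List[List[int]]:
    """Build one feature combination per threshold |w_i|.

    The i-th combination keeps every feature index j with |w_j| >= |w_i|
    (comparison on absolute values is an assumption of this sketch).
    """
    abs_w = [abs(w) for w in weights]
    return [
        [j for j, a in enumerate(abs_w) if a >= threshold]
        for threshold in abs_w
    ]


def select_optimal_combination(
    weights: Sequence[float],
    evaluate: Callable[[List[int]], float],
) -> Tuple[List[int], float]:
    """Return the combination with the highest score and that score.

    `evaluate` is a hypothetical callback that trains and validates a
    discrimination model on the given feature indices and returns its
    F-score. On ties, the first combination encountered is kept.
    """
    best_combo: List[int] = []
    best_score = float("-inf")
    for combo in candidate_combinations(weights):
        score = evaluate(combo)
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score


# Toy usage: illustrative weights and made-up validation scores.
weights = [0.8, -0.1, 0.5, 0.05]
toy_scores = {
    (0,): 0.6,
    (0, 1, 2): 0.7,
    (0, 2): 0.9,
    (0, 1, 2, 3): 0.65,
}
best_combo, best_score = select_optimal_combination(
    weights, lambda combo: toy_scores[tuple(combo)]
)
# best_combo is [0, 2] and best_score is 0.9 for these toy numbers
```

In practice the `evaluate` callback would refit the discrimination model on each candidate combination and compute `f_score` on held-out samples. The toy score table here only stands in for that step and does not reflect any result from the paper.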