As the economic environment keeps changing, listed companies face many risks and challenges. To safeguard the healthy development of enterprises and reduce the risks borne by investors and managers, it is important to choose a financial crisis early-warning model in a scientific and reasonable way. Such a model enables the management of a listed company to anticipate risks and take timely countermeasures against threats to the company's financial operation, thereby reducing or avoiding the losses that a financial crisis brings to management, investors, the government, and society as a whole. Effective financial crisis early warning strengthens an enterprise's ability to resist risk and can, to a certain extent, improve its management system, supporting healthy and sustainable development.

In this paper, a total of 428 listed companies that received special treatment (ST) for the first time between 2016 and 2021 were selected from the CSMAR database. Financially normal listed companies were matched to them at a ratio of 1:1 according to the principle of the same accounting period, the same industry, and similar asset size, and the indicator data of these companies for year T-3 were collected. Based on research on financial crisis early warning by domestic and foreign scholars, 94 financial and non-financial indicators were initially selected. The financial indicators cover five aspects: solvency, profitability, development capability, operation capability, and cash capacity. The non-financial indicators cover three aspects: equity structure, enterprise structure and size, and audit opinion.

We first conduct an exploratory analysis of the dataset, mainly by visualizing kernel density plots, bar charts, and a heat map of correlation coefficients, to obtain an initial view of the relationship between the feature variables and the target variable. Given the characteristics of the data, pre-processing steps such as missing-value imputation and standardization are applied to improve data quality, facilitate the subsequent modeling, and improve prediction performance. Before dimensionality reduction, significance tests are used to screen out 77 indicator variables that significantly distinguish companies in financial crisis from financially normal companies. To further reduce data redundancy and indicator overlap, two nonlinear dimensionality reduction methods, AutoEncoder and kernel principal component analysis, are used to reduce the indicators to 10, 20, ..., 60 dimensions. For the AutoEncoder, the activation function at each target dimension is chosen according to the accuracy of feature extraction. The reduced features are then combined with random forest, XGBoost, and support vector machine classifiers to construct the financial crisis early-warning models, and the parameters are tuned by grid search with 10-fold cross-validation to improve prediction performance.

Comparing the two dimensionality reduction methods, classification models built on AutoEncoder features mostly perform better than those built on kernel principal component analysis features, so the AutoEncoder is the more suitable method for reducing the indicator dimensions of the early-warning models in this paper. Comparing the classification algorithms, the support vector machine outperforms the random forest and XGBoost models under both AutoEncoder and kernel principal component analysis dimensionality reduction, and its ROC curves lie closer to the upper-left corner. Overall, the combined AutoEncoder and support vector machine model performs best, and on the evaluation metrics used in this paper the best results are obtained when the indicators are reduced to 40 dimensions, with an F1 value of 0.922.
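
The 1:1 sample matching described above (same accounting period, same industry, similar asset size) can be illustrated with a minimal sketch. This is not the paper's actual procedure: the `Firm` record, the function name `match_one_to_one`, the greedy nearest-size rule, matching without replacement, and all firm data below are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Firm:
    code: str        # hypothetical stock code
    year: int        # accounting period
    industry: str    # industry label
    assets: float    # total assets, in any consistent unit


def match_one_to_one(st_firms, normal_firms):
    """Pair each ST firm with the closest-sized unused normal firm
    from the same year and industry (greedy, without replacement)."""
    used = set()   # codes of normal firms already assigned as controls
    pairs = []
    for st in st_firms:
        candidates = [
            n for n in normal_firms
            if n.code not in used
            and n.year == st.year
            and n.industry == st.industry
        ]
        if not candidates:
            continue  # no eligible control; this ST firm stays unmatched
        # Choose the control whose asset size is closest to the ST firm's.
        best = min(candidates, key=lambda n: abs(n.assets - st.assets))
        used.add(best.code)
        pairs.append((st, best))
    return pairs


# Made-up example data, not taken from CSMAR.
st_firms = [
    Firm("ST_A", 2018, "manufacturing", 120.0),
    Firm("ST_B", 2018, "manufacturing", 300.0),
    Firm("ST_C", 2019, "retail", 50.0),
]
normal_firms = [
    Firm("N_1", 2018, "manufacturing", 110.0),
    Firm("N_2", 2018, "manufacturing", 310.0),
    Firm("N_3", 2019, "retail", 80.0),
    Firm("N_4", 2019, "mining", 50.0),
]

pairs = match_one_to_one(st_firms, normal_firms)
matched = [(st.code, n.code) for st, n in pairs]
print(matched)
```

A greedy pass is the simplest choice and depends on the order in which ST firms are processed; an optimal assignment over all pairs would be an alternative if the total size difference needs to be minimized.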