
A Study On Financial Distress Prediction Based On Imbalanced Dynamic Ensemble Model

Posted on: 2023-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: T T Ren
GTID: 2568307124478574
Subject: Management Science and Engineering
Abstract/Summary:
The in-depth development of economic globalization has brought considerable opportunities to listed companies, but it has also brought them many uncertainties. As operating a company becomes more difficult, the risk of financial distress rises. Once financial distress occurs, it not only causes financing difficulties and management setbacks for the enterprise, but also harms the interests of investors and affects the healthy functioning of the financial market. Constructing an accurate financial distress prediction model is therefore an urgent task, and the topic has become a focus of attention in both academia and industry. An effective prediction model helps enterprises gain insight into their own situation and take measures to improve their financial status, and it gives investors a reference for adjusting their investment strategies in time to avoid greater losses. Providing listed companies with an effective financial distress prediction model is thus of great practical significance.

In financial distress prediction research, companies in financial distress are far fewer than healthy companies. Such an imbalanced data distribution often reduces the accuracy of traditional classification models, or causes them to fail outright, so that they lose the ability to predict financial distress. In addition, a large share of existing financial distress prediction models are based on cross-sectional data. An enterprise's development is not constant, however, and its operating state changes often. These frequent changes mean that a model built on old data is no longer suitable for new samples, which also leads to model failure. To address both problems at the same time, this thesis takes China's listed manufacturing companies as the research object, uses machine learning methods as benchmark models, and builds a financial distress prediction model based on an imbalanced dynamic ensemble approach.

Firstly, a cost-sensitive model is constructed by introducing misclassification costs to handle the imbalanced data distribution. To counter the model bias caused by class imbalance, this thesis draws on the relevant literature to set different misclassification-cost weights for samples of different classes, and modifies the basic machine learning models (logistic regression, support vector machine, and decision tree) to obtain cost-sensitive models. The empirical study on cross-sectional datasets shows that cost-sensitive learning effectively improves the classification accuracy for financially distressed companies, performs well on the evaluation metrics, and is robust.

Then, this thesis compares cost-sensitive learning at the algorithm level with the SMOTE resampling technique at the data level to find the best model for handling the imbalanced data distribution. The resampling model is also constructed on the cross-sectional dataset and compared with the cost-sensitive model obtained above. The results show that both cost-sensitive learning and resampling effectively improve the classification accuracy for financially distressed companies, and that the former outperforms the latter.

Finally, cost-sensitive learning is combined with a dynamic ensemble algorithm based on a time-weighted factor to build an imbalanced dynamic financial distress prediction model. At this stage the panel dataset is taken as the research object, the dynamic dataset is constructed by the incremental expansion method, and dynamic feature selection is carried out to verify that the concept drift problem exists. The ADASVM-TW model is introduced; by changing the update formula for the sample weights, it takes the time-weighted factor of each sample into account while still tracking whether the sample is misclassified. The cost-sensitive support vector machine is used as its base classifier to build the ADA-CSSVM-TW model, which deals with imbalanced data distribution and concept drift at the same time. Experimental results on the dynamic datasets demonstrate the effectiveness of the improved algorithm, and the rationality and reliability of the model are further verified by significance tests and by robustness tests that vary the data imbalance rate.
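The abstract does not state the modified sample-weight update formula, so the following is only a minimal Python sketch of the general idea: an AdaBoost-style reweighting round that also scales each sample by a class-dependent misclassification cost and a time-weighted factor. The function names, the inverse-frequency cost heuristic, the exponential time decay, and the `decay` parameter are all assumptions made for illustration and are not taken from the thesis.

```python
import math


def class_costs(labels):
    """Assumed heuristic: inverse-frequency misclassification cost per class."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def time_factors(periods, decay=0.5):
    """Assumed form: exponential decay, so the newest period gets factor 1.0."""
    latest = max(periods)
    return [math.exp(-decay * (latest - t)) for t in periods]


def update_weights(weights, labels, preds, periods, decay=0.5):
    """One illustrative boosting round with cost and time factors.

    Returns the normalized sample weights and the classifier weight alpha.
    """
    costs = class_costs(labels)
    factors = time_factors(periods, decay)

    # Weighted error of the current base classifier, clipped away from 0 and 1.
    err = sum(w for w, y, p in zip(weights, labels, preds) if y != p)
    err = err / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)
    alpha = 0.5 * math.log((1 - err) / err)

    new_weights = []
    for w, y, p, f in zip(weights, labels, preds, factors):
        # Standard AdaBoost direction: raise weight if misclassified, else lower.
        sign = 1.0 if y != p else -1.0
        # Assumed extension: scale by class cost and by recency of the sample.
        new_weights.append(w * math.exp(sign * alpha) * costs[y] * f)

    total = sum(new_weights)
    return [w / total for w in new_weights], alpha


# Toy example: label 1 = distressed (minority), label 0 = healthy (majority).
labels = [1, 0, 0, 0]
preds = [0, 0, 0, 0]            # the base classifier misses the distressed firm
periods = [2021, 2021, 2021, 2021]
weights, alpha = update_weights([0.25] * 4, labels, preds, periods)
```

In this toy run the misclassified distressed sample ends up with a larger weight than the correctly classified healthy samples, which is the behavior a cost-sensitive boosting scheme aims for; older samples would additionally be down-weighted if their periods differed.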
Keywords/Search Tags:imbalanced data, machine learning, cost-sensitive, time-weighted, dynamic prediction