
A Study On Credit Default Prediction Based On Supervised Learning

Posted on: 2023-04-06; Degree: Master; Type: Thesis
Country: China; Candidate: Q Wu
GTID: 2530307073458224; Subject: Finance
Abstract/Summary:
In the financial credit sector, the rapid growth of business and data volumes driven by the digital economy has gradually exposed the risks embedded in the credit industry. Deliberate fraud and malicious default are key constraints on the industry's healthy development, and under this pressure traditional risk management approaches struggle to balance efficiency against risk. This paper focuses on the pre-loan risk control process and investigates whether machine learning algorithms can improve the accuracy and efficiency of default probability prediction and compensate for the shortcomings of traditional risk control, thereby alleviating information asymmetry in the credit market. To make the model more applicable in practice, the paper also focuses on the interpretability of the predictions, seeking to offset the limited explanatory power of black-box algorithms, improve the model's reliability, and provide a stronger basis for its large-scale application.

From the perspective of game behavior, this paper discusses the optimal decisions of the different participants in different situations, proposes a mechanism for establishing a pre-loan default prediction model, clarifies the economic significance of improving default risk prediction and reducing user screening costs, and offers a new way to address the adverse selection problem in the credit market. On this basis, the paper takes desensitized data published by a credit platform as its research object, builds a game model and four machine learning models based on the Logistic Regression, Random Forest, XGBoost, and LightGBM algorithms, explores the applicability of each model to credit default risk prediction, and compares the predictive performance of the algorithms through comparative experiments. In addition, a new model that integrates LightGBM and Logistic Regression features is established.

This paper creatively combines game theory with machine learning algorithms and reaches the following conclusions: (1) The lender's income is strongly related to the screening cost and interest income. The lower the screening cost, the more likely the lender is to profit and the more willing it is to lend. Establishing a default probability prediction mechanism helps to fundamentally reduce costs on the capital side, eases credit lenders' "reluctance to lend", and thereby supports inclusive finance; (2) Machine learning algorithms do have advantages in credit default prediction. Among the four models discussed in this paper, the LightGBM algorithm performs particularly well on the evaluation metrics; (3) The LightGBM + Logistic Regression model improves prediction accuracy; (4) The interpretability method TreeSHAP is applicable to financial credit default prediction scenarios.
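
The link between screening cost and the lending decision in conclusion (1) can be illustrated with a small numeric sketch. The abstract does not give the thesis's game model, so the one-period payoff below, together with every function name, parameter, and number in it, is a hypothetical assumption chosen only to show the direction of the effect.

```python
# Hypothetical one-period lender payoff; NOT the game model from the thesis.
# Assumed payoff: (1 - p) * P * r  -  p * P * lgd  -  c
#   p   = predicted probability of default
#   P   = principal, r = interest rate
#   lgd = share of principal lost on default (assumed)
#   c   = screening cost per applicant (assumed)


def expected_lender_payoff(p_default, principal, interest_rate,
                           loss_given_default, screening_cost):
    """Expected payoff of granting one loan under the assumed model."""
    gain = (1.0 - p_default) * principal * interest_rate
    loss = p_default * principal * loss_given_default
    return gain - loss - screening_cost


def should_lend(p_default, principal, interest_rate,
                loss_given_default, screening_cost):
    """Lend only if the assumed expected payoff is strictly positive."""
    return expected_lender_payoff(p_default, principal, interest_rate,
                                  loss_given_default, screening_cost) > 0


def break_even_default_probability(principal, interest_rate,
                                   loss_given_default, screening_cost):
    """Default probability at which the assumed expected payoff is zero."""
    return ((principal * interest_rate - screening_cost)
            / (principal * (interest_rate + loss_given_default)))


if __name__ == "__main__":
    # Illustrative numbers only: same borrower, two screening costs.
    loan = dict(principal=10_000, interest_rate=0.10, loss_given_default=0.60)
    for cost in (200, 50):
        p_star = break_even_default_probability(screening_cost=cost, **loan)
        decision = should_lend(0.12, screening_cost=cost, **loan)
        print(f"cost={cost}: break-even p={p_star:.4f}, lend at p=0.12: {decision}")
```

Under these assumptions, lowering the screening cost raises the default probability the lender can tolerate, so a borrower who would be rejected at the higher cost becomes acceptable at the lower one.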
Keywords/Search Tags:Credit risk control, Default forecast, Machine learning, Model interpretability