
Research On Optimization Of Personal Credit Default Probability Estimation

Posted on: 2021-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: S W Wang
GTID: 2510306302953519
Subject: Master of Finance
Abstract/Summary:
In recent years, FinTech has grown rapidly, driven by the development of Internet technology and the gradual expansion of financial services coverage. The rise of the Internet industry has put unexpected external competitive pressure on traditional financial institutions, compelling them to begin transforming and to gradually put FinTech into practical use. In the credit business, personal credit loans are characterized by rapidly growing market demand, high frequency, and relatively small amounts, which requires pre-loan risk control to be more accurate and stricter in order to avoid potential losses. Early warning of potential credit defaults and the improvement of the credit information system are therefore gradually gaining importance in China.

This research aims to measure the probability of credit default in pre-loan risk control with machine learning models. It uses the Home Credit credit default dataset as its data source and Python for data acquisition, dimension division, data preprocessing, model building, and model evaluation. Combining the theory and practice of credit default and data science, it verifies and explores methods for improving model results, summarizes the experience gained, and gives practical suggestions. The model evaluation criteria take AUC as the core indicator, together with a certain level of F1 score, recall, and precision, with the results of a logistic regression model as the baseline. Because LightGBM performed best among XGBoost, LightGBM, and random forest, it is selected as the benchmark classifier for result analysis.

Based on the LightGBM results, this research draws the following conclusions. First, measuring personal credit default probability is a prediction problem, so it is necessary to combine financial credit theory, credit evaluation theory, and machine learning methodology as application guidance. In addition, the comprehensiveness of the indicators and dimensions used for personal credit default prediction, together with the effectiveness of detailed data preprocessing and model tuning, rather than algorithm optimization alone, can greatly improve prediction accuracy. Second, the explanatory power of machine learning models can draw on statistical models to provide substitute solutions, which also makes it possible to obtain the contributions of both features and dimensions for sub-samples; this improves the interpretability of the model, helps locate model problems, and promotes model iteration and optimization. Third, the development of the credit information industry plays a fundamental role in improving the risk control capabilities of financial institutions.

Based on these conclusions, this research makes the following suggestions to financial institutions. The first is to build a credit scoring feature system, constructing meso-level dimensions from both horizontal and vertical perspectives and cooperating with external credit agencies to obtain credit data related to personal livelihood and socioeconomic activity. The second is to improve the steps of model building and model iteration: apply general data preprocessing steps as a reference, then make adjustments based on feedback from model evaluation; improve the parameter tuning process and the evaluation criteria system; combine random search and grid search with cross-validation to find the optimal parameter set; and improve the interpretability of the model to promote model optimization and iteration.

As for the novelties of this research, in terms of theory, this paper distinguishes between predictive indicators and influencing factors of credit default, combines credit risk management theory with data science theory to extend the indicators and dimensions for personal credit scoring, re-examines the methodology of personal credit default risk evaluation, and improves the practical interpretation of machine learning models for personal credit scoring. In terms of approach, this paper restructures the credit scoring feature system, quantifies the effect of data preprocessing on model accuracy, improves the model evaluation criteria system and the method of measuring results, and improves the model output.
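
The evaluation criteria named in the abstract (AUC as the core indicator, plus F1 score, recall, and precision) can be illustrated with a minimal pure-Python sketch. This is not the thesis's own code: the function names `roc_auc` and `precision_recall_f1`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. AUC is computed here as the fraction of (defaulter, non-defaulter) pairs in which the defaulter receives the higher predicted probability, with ties counted as one half.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Label 1 means default, label 0 means no default. Ties count as 0.5.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision_recall_f1(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 after thresholding predicted probabilities.

    The default threshold of 0.5 is an illustrative assumption.
    """
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (made up for illustration): 1 = default, 0 = no default.
y_true = [0, 0, 1, 0, 1, 1, 0, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.55, 0.05]

auc = roc_auc(y_true, scores)
precision, recall, f1 = precision_recall_f1(y_true, scores)
print(f"AUC={auc:.3f} precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

AUC does not depend on a decision threshold, which is one reason it can serve as a core indicator, whereas precision, recall, and F1 change with the chosen cut-off. Running the same functions on a baseline model's scores and a candidate model's scores would allow the kind of comparison against a baseline that the abstract describes.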
Keywords/Search Tags: credit default risk, machine learning, credit scoring