Personal Credit Risk Assessment Based On Stacking Fusion Model

Posted on: 2022-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: Q W Duan
Full Text: PDF
GTID: 2480306314970889
Subject: Probability theory and mathematical statistics
Abstract/Summary:
With the development of Internet information technology and the financial system, many credit platforms that combine the Internet with financial institutions are emerging. There are two reasons for this. On the one hand, growing consumer demand and changing attitudes toward consumption have made people more willing to "consume ahead of time" than before. On the other hand, combining Internet technology with financial institutions makes online credit more convenient and efficient than traditional credit. At the same time, however, online credit carries the risk of default. Credit default not only causes capital turnover difficulties for financial institutions but also affects the healthy and sustainable development of the national economy as a whole. For China's financial institutions, it is therefore urgent to establish a robust, efficient and accurate personal credit scoring model, raise the level of credit risk control, and standardize the prevention of credit risk.

Based on real credit data from an Internet bank, this thesis aims to build a robust, accurate and efficient risk assessment model for Internet banks. First, four base models were built: logistic regression, random forest, XGBoost and LightGBM. A stacking fusion model was then built on top of these four models. In the preprocessing stage, the raw data underwent missing-value handling, outlier handling, construction of behavior features, and resampling. Two resampling methods were compared, random undersampling and SMOTE oversampling, and SMOTE oversampling was found to perform better. In the feature construction and selection stage, the maximum, minimum and mean values were used to construct the behavior features, and the random forest algorithm was used to eliminate features of low importance.

The empirical results show that the fusion model based on the SMOTE-stacking framework improves both the prediction accuracy and the stability of the model. It outperforms logistic regression and three mainstream ensemble algorithms (random forest, XGBoost and LightGBM) on the AUC and KS metrics; in terms of stability, it is second only to logistic regression and more stable than the other three ensemble algorithms.

Domestic financial institutions need to track and monitor customers' long-term risk both before and after granting loans. Building an accurate risk control model from customers' multidimensional credit data and predicting their future repayment is therefore an essential part of risk management for every financial institution. This thesis can serve as a reference for that work.
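The abstract compares models on the AUC and KS indicators without defining them. The following is a minimal sketch of how the two metrics are commonly computed from predicted default scores. It is not taken from the thesis: the function names `auc_score` and `ks_statistic` are hypothetical, and it assumes that label 1 marks a defaulting customer and that a higher score means higher predicted risk.

```python
from typing import Sequence


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.

    Assumption: label 1 = default, label 0 = no default.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A pair counts 1 if the defaulter scores higher, 0.5 if the scores tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """KS as the largest gap between the cumulative true-positive rate
    and the cumulative false-positive rate over all score thresholds."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Walk the thresholds from the highest score down.
    pairs = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all samples that share this score before measuring the gap.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best
```

Both functions take the true labels and the predicted scores of a held-out sample, so the same pair of calls can be applied to each candidate model to compare them on equal terms. The pairwise AUC loop is quadratic in the sample size and is meant for illustration; a rank-based implementation would be preferable on large credit datasets.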
Keywords/Search Tags: credit evaluation, imbalanced samples, logistic regression, ensemble algorithm, stacking model