
Credit Risk Assessment Based On Multi-Model Fusion

Posted on: 2024-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: Q Ma
GTID: 2558307079991429
Subject: Applied statistics

Abstract/Summary:
Internet finance, as a new form of finance, has entered the market rapidly and is changing how people think about consumption and financial management. More and more consumers have developed the habit of spending in advance and borrowing, and the speed and convenience of credit platforms meet their borrowing needs. Together, these factors provide "fertile soil" for the growth of credit institutions. The accompanying risks, however, have also led to a "deadlock": defaulting users have caused significant economic losses for the platforms. It is therefore of great significance to identify likely defaulters and to build effective risk control models that reduce credit risk.

This paper starts from the actual data of a credit platform. After data cleaning, feature encoding, missing-value imputation, feature selection based on IV (information value), feature elimination based on the correlation coefficient, and imbalanced-data processing with SMOTE-Tomek combined sampling, four models are constructed and their parameters optimized: logistic regression, random forest, XGBoost, and a BP neural network. To extract deeper information and improve model performance, a CNN feature extractor without a pooling layer is designed. The extracted features are used to train the four models above, forming a combined learning mode in which deep learning is responsible for feature extraction and machine learning is responsible for classification. On this basis, two different model fusion methods are also used: the tandem (serial) Blending method and the parallel weighted combination method. Finally, the SHAP model is used to explore global feature importance and identify default factors, and the LIME model is used to explore the influencing factors and the basis of the prediction for a single sample.

The main conclusions of this paper are as follows:

(1) Model fusion. In essence, Blending is a tandem model, while the weighted combination is a parallel model; they are two different ways of fusing models. According to the results, the accuracy, F1, AUC, KS, and other indicators of both fused models improve to a certain extent over the single models, indicating that the combined models classify better than any of the four single models and can better evaluate credit risk and reduce losses.

(2) Feature extraction. CNNs show their feature extraction advantage most clearly on image data, but they are also useful on the structured data in this paper, especially once the pooling layer is removed. Compared with the original models, the models trained on the extracted features perform better to a certain extent.

(3) Default factors. The user bad-debt analysis, the SHAP model, and the LIME model all indicate the importance of loan term, sub_grade, emp_length, mort_acc, verification_status, and other features to credit risk assessment. Credit examiners should therefore pay attention to these factors and review them carefully to reduce risk.
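Two of the steps named in the abstract, IV-based feature selection and the parallel weighted combination, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `information_value` and `weighted_fusion`, the additive `smoothing` term that avoids taking the logarithm of zero, and the example weights are all assumptions made for illustration. The label convention (1 = default, 0 = non-default) is also assumed.

```python
import math


def information_value(bins, labels, smoothing=0.5):
    """Information value (IV) of one binned feature.

    bins   : bin identifier of each sample (any hashable value)
    labels : 1 = default ("bad"), 0 = non-default ("good")  [assumed convention]
    smoothing : count added to every cell so empty cells do not produce log(0)
                (an assumption of this sketch, not taken from the thesis)
    """
    counts = {}  # bin -> [good_count, bad_count]
    for b, y in zip(bins, labels):
        counts.setdefault(b, [0, 0])[y] += 1

    n_bins = len(counts)
    total_good = sum(c[0] for c in counts.values()) + smoothing * n_bins
    total_bad = sum(c[1] for c in counts.values()) + smoothing * n_bins

    iv = 0.0
    for good, bad in counts.values():
        good_share = (good + smoothing) / total_good
        bad_share = (bad + smoothing) / total_bad
        woe = math.log(good_share / bad_share)  # weight of evidence of the bin
        iv += (good_share - bad_share) * woe
    return iv


def weighted_fusion(prob_lists, weights):
    """Parallel fusion: weighted average of each model's default probability.

    prob_lists : one list of predicted probabilities per model, equal lengths
    weights    : one non-negative weight per model (normalised internally)
    """
    total = sum(weights)
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]


if __name__ == "__main__":
    # A feature whose bins separate defaulters perfectly has a high IV.
    print(information_value(["A", "A", "B", "B"], [0, 0, 1, 1]))
    # Fuse two hypothetical models, trusting the second three times as much.
    print(weighted_fusion([[0.2, 0.8], [0.4, 0.6]], [1, 3]))
```

In a pipeline like the one described, features whose IV falls below a chosen threshold would be dropped before modelling, and the fusion weights would be tuned on a validation set. The tandem Blending method differs in that it trains a second-stage model on the base models' held-out predictions instead of averaging them.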
Keywords: credit risk, machine learning, feature extraction, combination model