
Combination Model Of Personal Credit Risk Assessment Based On Feature Engineering

Posted on: 2022-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: T T Qi
Full Text: PDF
GTID: 2480306557966739
Subject: Applied Statistics
Abstract/Summary:
Personal consumer credit loans play an increasingly important role in the lending business of China's commercial banks, but the risk of personal credit default in China is rising year by year. To build effective credit risk controls, financial institutions collect user information extensively and assemble it into credit data warehouses, which often leaves individual credit data with large sample sizes, high dimensionality, sparsity, and redundancy. Processing users' credit data efficiently and constructing effective personal credit risk assessment methods have therefore become the key to solving this problem.

First, this thesis applies feature engineering to personal credit data with 105 features and 50,000 samples provided by a bank. Outlier and missing-value handling, feature derivation, feature selection, and feature binning and encoding reduce the data dimension and improve data quality, strengthening data preparation before modeling. Second, XGBoost, LightGBM, and Random Forest are used to build single models of personal credit risk assessment. Drawing on the results of these single models and on a review of combination models, the architecture of the combination model is determined: four single models, XGBoost1, LightGBM1, XGBoost2, and LightGBM2, are built on the basis of algorithm differences and feature perturbation, and the weight of each single model is determined by the weighting method, the minimum variance method, the performance evaluation method, and the grid search method. The performance evaluation method and the grid search method use the model's comprehensive performance score as their evaluation index. This yields four linear combination models of personal credit risk assessment. In addition, to evaluate how much feature derivation improves the assessment models, another four single models and four combination models are built with the same architecture on the personal credit data without derived features.

Finally, three comparisons are made: single models with and without derived features, single models against combination models under each feature set, and combination models with and without derived features. The four single models with derived features outperform the four single models without them in comprehensive performance score, AUC, and KS, with average increases of 2.24%, 1.39%, and 1.07%, respectively. Under both feature sets, the combination models have better comprehensive performance and AUC than the single models, and their KS is also significantly better than that of most single models. The combination models whose weights are determined from the model's comprehensive performance score achieve a better comprehensive performance score and KS than the other combination models. The theoretical analysis and empirical results therefore show that, in personal credit risk assessment, the proposed combination model with derived features and weights based on the model's comprehensive performance score has better predictive performance.
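As an illustration of the linear combination idea summarized above, the following is a minimal Python sketch of choosing combination weights by grid search and scoring the result with AUC and KS. It is not the thesis's implementation. The abstract does not define the comprehensive performance score, so the `composite` function (a plain average of AUC and KS) is a hypothetical stand-in, and the function names, the grid resolution `steps`, and the toy scores are all assumptions made for the example.

```python
from itertools import product


def auc(labels, scores):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks(labels, scores):
    """KS statistic: largest gap between cumulative true-positive and false-positive rates."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every sample that shares the current score before measuring the gap.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
        i = j
    return best


def composite(labels, scores):
    """Hypothetical composite score (assumption): simple average of AUC and KS."""
    return 0.5 * (auc(labels, scores) + ks(labels, scores))


def combine(score_lists, weights):
    """Weighted linear combination of the single models' predicted default scores."""
    n = len(score_lists[0])
    return [sum(w * s[i] for w, s in zip(weights, score_lists)) for i in range(n)]


def grid_search_weights(labels, score_lists, metric, steps=10):
    """Try every weight vector on a grid whose entries sum to 1; keep the best by `metric`."""
    best_w, best_val = None, float("-inf")
    for combo in product(range(steps + 1), repeat=len(score_lists)):
        if sum(combo) != steps:
            continue
        weights = [c / steps for c in combo]
        val = metric(labels, combine(score_lists, weights))
        if val > best_val:
            best_w, best_val = weights, val
    return best_w, best_val


# Toy data (made up): 1 = default, 0 = no default; two single models' scores.
labels = [1, 0, 1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.4, 0.5, 0.8, 0.1, 0.6, 0.7]
model_b = [0.6, 0.3, 0.9, 0.2, 0.5, 0.4, 0.1, 0.3]

weights, score = grid_search_weights(labels, [model_a, model_b], composite)
print("weights:", weights, "composite score:", round(score, 4))
```

Because the grid includes the weight vectors that put all weight on one model, the selected combination can never score below the best single model on the data used for the search. In practice the weights would be chosen on a validation set and the comparison reported on held-out data.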
Keywords/Search Tags: Personal Credit Evaluation, Combination Modeling, Feature Engineering