Research On Personal Credit Risk Assessment Based On LightGBM Algorithm

Posted on: 2021-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Zhu
GTID: 2439330620480947
Subject: Financial
Abstract/Summary:
With the advent of the big data era and the rapid development of the credit industry, competition among financial institutions has become particularly fierce. Commercial banks occupy an important position in the financial system, and credit risk has become the main factor affecting their development and stability. A personal credit evaluation model can help commercial banks process large numbers of credit applications quickly and reduce operating costs. This paper therefore aims to build an effective personal credit evaluation model, to compare and analyze the characteristics of several mainstream credit evaluation methods and machine learning methods, and to help commercial banks manage credit risk better. The data set consists of real credit data from a domestic commercial bank, including users' basic information and loan information. In the preparation stage, descriptive statistical analysis, data cleaning and feature engineering were carried out on the borrower information, including the exclusion of abnormal data, the filling of missing values and the derivation of feature variables. Then five algorithms, namely logistic regression, support vector machine, random forest, XGBoost and LightGBM, were used to construct personal credit risk assessment models, and the models were tuned. The AUC (Area Under Curve) value and the recall rate were used to evaluate and analyze the models. Next, the feature variables of the best-performing LightGBM model were ranked by importance. Finally, to address the problem of data imbalance, the data were processed with oversampling, undersampling and combined sampling, and the influence of the different sampling techniques on the performance of the LightGBM model was explored. The following conclusions are drawn in this paper:
1. The AUC values of the five models are all greater than 0.75, indicating that all five can effectively identify default behavior from multidimensional data.
2. LightGBM performs well in personal credit risk assessment. Its AUC value is 0.8953, and its recall rate, F1 score and running speed are higher than those of the other models.
3. The application period, job type, age and loan product contribute substantially to the model. In the credit business of commercial banks, these variables deserve particular attention.
4. Tomek links undersampling and random oversampling can improve the test-set AUC of LightGBM to some extent.
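To make the two evaluation metrics named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the thesis's code. The function names `roc_auc` and `recall`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. AUC is computed here from its rank interpretation: the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall(labels, scores, threshold=0.5):
    """Share of true defaulters (label 1) whose score reaches the threshold."""
    true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    false_neg = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


# Hypothetical toy data: 1 = default, 0 = no default (not from the thesis).
labels = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.3, 0.6, 0.2, 0.7, 0.4]

auc_value = roc_auc(labels, scores)    # 7 of 8 pairs ranked correctly
recall_value = recall(labels, scores)  # 1 of 2 defaulters caught at 0.5
print(auc_value, recall_value)
```

The pairwise loop is quadratic in the number of samples, so it only suits small examples. On a real credit data set a sort-based AUC routine from a standard library would be the practical choice.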
Keywords/Search Tags: Credit risk, Unbalanced data, Sampling technique, LightGBM algorithm, AUC value
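The two sampling techniques that the abstract reports as helpful, Tomek links undersampling and random oversampling, can be sketched as follows. This is an illustrative plain-Python version under stated assumptions, not the procedure or library used in the thesis. It assumes binary labels, squared Euclidean distance, and a majority class labelled 0. The function names and the toy one-dimensional data are hypothetical.

```python
import random
from collections import Counter


def tomek_link_undersample(rows, labels, majority=0):
    """Drop majority-class points that form a Tomek link.

    A Tomek link is a pair of points from different classes that are
    each other's nearest neighbour.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n = len(rows)
    nearest = [
        min((j for j in range(n) if j != i), key=lambda j: dist2(rows[i], rows[j]))
        for i in range(n)
    ]
    drop = {
        i for i, j in enumerate(nearest)
        if nearest[j] == i and labels[i] != labels[j] and labels[i] == majority
    }
    keep = [i for i in range(n) if i not in drop]
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: four non-defaulters (0) and two defaulters (1).
rows = [(0.0,), (1.0,), (2.5,), (4.9,), (5.0,), (9.0,)]
labels = [0, 0, 0, 0, 1, 1]

# (4.9,) and (5.0,) are mutual nearest neighbours of opposite class,
# so the majority point (4.9,) is removed.
cleaned_rows, cleaned_labels = tomek_link_undersample(rows, labels)

# Combined sampling: oversample the minority class after cleaning.
balanced_rows, balanced_labels = random_oversample(cleaned_rows, cleaned_labels)
print(Counter(cleaned_labels), Counter(balanced_labels))
```

Resampling of this kind is normally applied to the training split only, so that the test-set AUC still reflects the original class balance.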