
The Research Of Personal Credit Risk Assessment Based On Random Forest Model

Posted on: 2019-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: H X Z Su
GTID: 2429330545451607
Subject: Applied statistics
Abstract/Summary:
With the rapid development of network technology and rising demand for personal loans, China's personal credit products show considerable market potential, and the personal consumer credit industry has become a hotspot of competition in the financial market. When banks and financial institutions develop personal credit products, the main problem to be solved is credit risk. At present, however, China's financial institutions lack in-depth research on credit risk assessment, and the personal credit investigation system is still at an early, exploratory stage. Effectively assessing and measuring the level of personal credit risk is the key for banks and financial institutions to reduce bad assets at the source. Therefore, studying the characteristics of personal credit risk, establishing a sound personal credit system, and selecting an appropriate personal credit risk assessment model are core subjects for the healthy development of the personal credit industry in China's financial markets.

In this paper, the random forest model is selected to evaluate personal credit risk. This model tolerates noise well, effectively prevents over-fitting, and runs quickly. Compared with traditional risk assessment models, it can better solve the problem of personal credit risk assessment. This paper uses 42,535 samples from the third quarter of 2017 published by Lending Club, a well-known American P2P company, as the basic data set for establishing the personal credit system and the evaluation model. Firstly, missing value processing, normalization, and correlation testing were performed. Secondly, the feature selection method was optimized: the Boruta feature selection algorithm was introduced to screen out reasonable index variables. Then, the SMOTE algorithm was used to optimize the training set, reducing the imbalance of the data and improving the prediction accuracy on negative samples, and a personal credit risk assessment method based on the random forest model was established on this basis. Finally, the random forest model was compared with the logistic regression model.

The results show that, on both the original imbalanced data set and the data set balanced with the SMOTE algorithm, personal credit risk assessment based on the random forest model outperforms that based on the logistic regression model. After modeling on the training set balanced with SMOTE, the sample prediction accuracy of both learning models increased significantly. For the random forest model, the negative-class prediction accuracy on the training set and test set is reported as going from 0.7% to 79.5% and from 2.0% to 2.0%, respectively; for the logistic regression model, from 0.0% to 62.6% and from 0.1% to 0.0%. Not only is the improvement in negative-sample prediction accuracy much greater for the random forest model than for the logistic regression model, but its overall accuracy of 84.7% and 82.4% is also significantly higher than the logistic regression model's 62.6% and 63.2%. This fully demonstrates the applicability and validity of the random forest model in personal credit risk assessment.
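
The SMOTE rebalancing step and the negative-class accuracy measure described above can be illustrated with a minimal sketch. This is not the thesis's implementation, which the abstract does not give. The function names `smote_oversample` and `per_class_accuracy`, the toy data, and the parameters `k` and `seed` are all assumptions made for illustration, and the sketch treats the minority class (label 1) as the "negative" class.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    minority sample and one of its k nearest minority neighbours
    (the core idea of SMOTE; hypothetical helper, not from the thesis)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours (Euclidean distance),
        # excluding the base sample itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def per_class_accuracy(y_true, y_pred):
    """Fraction of each class's samples that were predicted correctly."""
    totals, hits = {}, {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Toy minority-class feature vectors (illustrative values only).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
new_points = smote_oversample(minority, n_new=4, k=2)
print(len(new_points))  # 4

# A classifier that always predicts the majority class is 75% accurate
# overall here, yet never identifies the minority class.
print(per_class_accuracy([0, 0, 0, 1], [0, 0, 0, 0]))  # {0: 1.0, 1: 0.0}
```

The second call shows why the abstract reports negative-class accuracy separately from overall accuracy: on imbalanced data a model can score well overall while scoring near zero on the minority class, which is the situation SMOTE is meant to improve.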
Keywords/Search Tags:Personal credit risk, Random forest model, Boruta algorithm, SMOTE algorithm, Logistic regression model