
Research On Risk Assessment Model Of Personal Credit Application Based On Co-forest Algorithm

Posted on: 2022-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: W X He
Full Text: PDF
GTID: 2518306521981859
Subject: Applied Statistics
Abstract/Summary:
Internet finance in the credit field leans toward small loans to individual consumers. Its services are faster and more convenient, its transaction costs are lower, its customer coverage is wider, and its credit threshold is lower, all of which have driven unprecedented growth. User fraud and high default rates have followed, so how to mine user information and build a more accurate user risk assessment model has become a widely discussed topic.

This thesis studies risk assessment models for personal credit applications. Because the reject inference problem biases the credit model, the Co-forest algorithm is proposed as the main learning model. In the Co-forest algorithm, base classifiers (random trees) with different generalization ability all carry the same voting weight. The algorithm is therefore improved by weighting each tree's vote by its predicted probability for each class, and experiments on real data show that the optimized model performs better.

UnionPay commercial credit overdue data serve as the empirical data. Pre-processing consists of filling missing values by random forest regression, selecting features with a variable correlation diagnosis and the maximal information coefficient (MIC), and normalizing the data. The data are then examined with statistical methods such as the chi-square test, and an exploratory analysis is carried out along six dimensions: identity and property status, card information, transaction information, lending information, repayment information, and loan application information. Under different loan application rejection rates, machine learning models including Co-forest, weighted Co-forest, SVM, and random forest (RF) are built on the features of all dimensions. They are compared on evaluation metrics such as AUC, KS, and the error rates for "good" and "bad" customers, covering three aspects: model accuracy, class discrimination ability, and error cost. The results show that the optimized Co-forest model outperforms the traditional Co-forest, random forest, and SVM models.

Finally, to improve the interpretability of Co-forest, a so-called "black box" model, a Co-forest model is built for each feature dimension separately. Evaluating the classification performance of each dimension gives the following order of importance for the overall credit risk assessment model: transaction information, lending information, repayment information, loan application information, card information, and identity and property status.
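The difference between equal-weight voting and probability-weighted voting described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the example probabilities, and the per-tree weights (imagined here as something like each tree's validation accuracy) are all assumptions made for illustration, since the abstract does not state how the weights are derived.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(tree_probas: Sequence[Sequence[float]]) -> int:
    """Equal-weight hard voting: each tree casts one vote for its top class."""
    labels = [max(range(len(p)), key=p.__getitem__) for p in tree_probas]
    return Counter(labels).most_common(1)[0][0]


def weighted_soft_vote(
    tree_probas: Sequence[Sequence[float]],
    tree_weights: Sequence[float],
) -> List[float]:
    """Weighted average of per-tree class probabilities (weights are normalized)."""
    if len(tree_probas) != len(tree_weights):
        raise ValueError("need exactly one weight per tree")
    total = sum(tree_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = [0.0] * len(tree_probas[0])
    for proba, weight in zip(tree_probas, tree_weights):
        for k, p in enumerate(proba):
            combined[k] += weight * p / total
    return combined


# Hypothetical output of three random trees for one applicant,
# as [P(good), P(bad)]. One confident tree disagrees with two unsure ones.
tree_probas = [[0.95, 0.05], [0.45, 0.55], [0.40, 0.60]]

# Hypothetical per-tree weights (assumed, e.g. validation accuracy).
tree_weights = [0.9, 0.6, 0.6]

hard_label = majority_vote(tree_probas)
combined = weighted_soft_vote(tree_probas, tree_weights)
soft_label = max(range(len(combined)), key=combined.__getitem__)

print("majority vote label:", hard_label)
print("weighted probabilities:", combined)
print("weighted vote label:", soft_label)
```

In this toy case the two unsure trees outvote the confident one under equal-weight hard voting, while the weighted probability average sides with the confident, higher-weight tree. Whether such a change helps on real credit data depends on how the weights are estimated.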
Keywords/Search Tags:Application for Personal Credit, Risk Assessment, Feature Selection, Co-forest model