Research On Personal Credit Risk Evaluation Model Of P2P Online Lending

Posted on: 2021-05-13    Degree: Master    Type: Thesis
Country: China    Candidate: X Y Yu    Full Text: PDF
GTID: 2428330602483558    Subject: Applied statistics
Abstract/Summary:
P2P online lending emerged at the beginning of the 21st century, as the financial industry and Internet technology gradually matured. It is an important part of private inclusive finance and Internet finance, a continuation of and innovation on microfinance, and a supplement to the traditional credit business structure. Since its emergence, it has developed rapidly around the world. However, P2P online lending started late in China, where an imperfect credit system and a lack of relevant laws and regulations have left the industry with serious potential capital security problems. In the era of big data, it is of great practical significance to extract useful information from massive data and to build an effective and reliable credit risk assessment model that accurately predicts default. Such a model improves the risk monitoring and identification ability of P2P platforms and investors, which is conducive to the healthy and stable development of the industry.

Although there has been a great deal of research on credit risk assessment for P2P online lending, most of it focuses on a single model, and after continuous refinement the room for further improving a single model's performance is very limited. Model combination has recently been highly regarded for its better predictive performance, but there is little research on it in this area. Therefore, this paper uses Logistic regression, a traditional statistical method, and random forest, a newer machine learning method, to build single models, and then attempts to combine them. The methods used and the models obtained can contribute to research in this area.

Lending data from Lending Club, a P2P platform in the United States, were selected as the empirical data set. The original data were preprocessed and the variables were screened by weight of evidence (WOE) and information value (IV), which removed the interference of irrelevant variables from later modeling and improved modeling efficiency. Next, models were constructed based on Logistic regression. Because the independent variables in Logistic regression are prone to multicollinearity, principal component analysis and the Lasso method were each combined with Logistic regression, giving a principal component analysis-stepwise regression-logistic regression model and a lasso-logistic regression model. The results showed little difference between these two models: their classification performance was not ideal, but their stability was good. A random forest model was then constructed, and it achieved better classification performance through the selection of important variables and the tuning of parameters. However, the random forest model was much less stable than the Logistic regression models. We therefore combined the Logistic regression model with the random forest model in two ways, a parallel combination and a serial combination.

Comparing the single models with the combined models, we conclude that the serial combination model combines the advantages of the single models well: it achieves better predictive performance, reduces model instability to some extent, and has the best overall performance. The results of this paper show that, compared with the single models, the serial combination model is more effective for the credit risk assessment of P2P online borrowers, and it can serve as a reference for domestic P2P platforms and investors assessing the credit risk of borrowers.
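
The WOE and IV screening step mentioned in the abstract can be illustrated with a minimal sketch. It assumes one common convention, WOE = ln(share of good loans in a category / share of bad loans in that category), with IV as the sum over categories of the share difference times the WOE. The function name `woe_iv`, the `smoothing` parameter, and the toy data are hypothetical and are not taken from the thesis, whose actual binning and IV thresholds are not stated here.

```python
import math
from collections import defaultdict


def woe_iv(categories, defaults, smoothing=0.5):
    """Return ({category: WOE}, IV) for one categorical variable.

    categories: iterable of category labels, one per loan.
    defaults:   iterable of 0/1 flags, 1 = defaulted (bad), 0 = repaid (good).
    smoothing:  pseudo-count added to every cell so that a category with
                zero good or zero bad loans does not produce log(0).
                (Assumption: the thesis does not say how it handles this.)
    """
    good = defaultdict(int)
    bad = defaultdict(int)
    for cat, flag in zip(categories, defaults):
        if flag == 1:
            bad[cat] += 1
        else:
            good[cat] += 1

    cats = sorted(set(good) | set(bad))
    total_good = sum(good.values()) + smoothing * len(cats)
    total_bad = sum(bad.values()) + smoothing * len(cats)

    woe = {}
    iv = 0.0
    for cat in cats:
        p_good = (good[cat] + smoothing) / total_good  # share of good loans
        p_bad = (bad[cat] + smoothing) / total_bad     # share of bad loans
        woe[cat] = math.log(p_good / p_bad)
        iv += (p_good - p_bad) * woe[cat]
    return woe, iv


if __name__ == "__main__":
    # Toy data: grade "A" is mostly repaid, grade "B" mostly defaulted.
    grades = ["A", "A", "A", "A", "B", "B", "B", "B"]
    flags = [0, 0, 0, 1, 0, 1, 1, 1]
    woe, iv = woe_iv(grades, flags)
    print(woe, iv)
```

Under this convention, a variable whose IV is close to zero barely separates defaulters from non-defaulters and is a candidate for removal before modeling; the cut-off used for that decision is a modeling choice.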
Keywords/Search Tags: P2P online lending, Credit risk, Logistic regression, Random forests, Model combination