Default Risk Assessment Of P2P Network Lending Based On Deep Forest

Posted on: 2021-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: F Wang
Full Text: PDF
GTID: 2428330632457463
Subject: Software engineering
Abstract/Summary:
As a representative of the "Internet Finance" model, P2P network lending has opened up a convenient financing channel for borrowers and investors. However, with the rapid development of the P2P network lending industry, defaults have occurred frequently and have driven a large number of P2P network lending platforms into bankruptcy. This has not only harmed the legitimate rights and interests of investors, but also endangered the security of the Internet finance industry and social stability. To address the low prediction accuracy, low F1 value and low AUC value of existing machine learning algorithms in default risk assessment, this thesis used the deep forest algorithm to construct a default risk assessment model for P2P network lending. The main work of the thesis was as follows.

Firstly, data preprocessing and feature selection. The thesis took the historical lending transaction data set of the Lending Club platform as the research object. First, the original data set was cleaned, the target variable and feature variables of the data set were determined, and the target variable was divided into two categories: default and non-default (performing). Second, the feature variables were divided into continuous and discrete feature variables. The continuous feature variables were normalized. The discrete feature variables were further divided into ordered and unordered discrete feature variables: the ordered ones were encoded as natural numbers and then normalized, and the unordered ones were one-hot encoded. Third, variance and mutual information were used as indicators for feature selection.

Secondly, establishing a default risk assessment model for P2P network lending based on deep forest. The thesis selected gradient boosting decision trees, random forests, extreme gradient boosting trees and completely random forests as the learners from which the deep forest was built. Each layer of learners received the original feature information together with the output feature information of the previous layer, and passed its own output to the next layer. The input of the model was the P2P network lending data after preprocessing and feature selection, and the output was the probability that the borrower would default.

Thirdly, comparison and analysis of risk assessment models. The thesis compared the deep forest with logistic regression, linear discriminant analysis, decision tree, K-nearest neighbor, naive Bayes, back-propagation (BP) neural network and an AdaBoost ensemble of decision trees. The experimental results showed that the deep forest model achieved an F1 value of 0.6790, an accuracy of 0.6824 and an AUC value of 0.7595. Compared with the other seven models, the deep forest model had better predictive performance in P2P network loan default risk assessment.
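
The three encoding rules in the preprocessing step can be illustrated with a minimal sketch. It assumes min-max scaling as the normalization method, which the abstract does not specify. The function names, the toy columns and the category order are all hypothetical and are not taken from the thesis or its data.

```python
from typing import Dict, List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def encode_ordered(values: Sequence[str], order: Sequence[str]) -> List[float]:
    """Map ordered categories to natural numbers (1, 2, ...), then normalize."""
    rank: Dict[str, int] = {level: i + 1 for i, level in enumerate(order)}
    return min_max_normalize([float(rank[v]) for v in values])


def one_hot(values: Sequence[str]) -> List[List[int]]:
    """One-hot encode unordered categories; columns follow sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical toy columns for illustration only (assumed, not thesis data).
loan_amount = [5000.0, 10000.0, 20000.0]   # continuous feature
grade = ["A", "C", "B"]                    # ordered discrete feature
home_ownership = ["RENT", "OWN", "RENT"]   # unordered discrete feature

print(min_max_normalize(loan_amount))
print(encode_ordered(grade, order=["A", "B", "C"]))
print(one_hot(home_ownership))
```

Ordered categories keep their ranking as a single numeric column, while unordered categories get one indicator column each, so that no artificial ordering is imposed on them.
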
Keywords/Search Tags: P2P network lending, credit risk, feature selection, deep forest