The Research Of P2P Loan Default Prediction Based On Data Mining Technology

Posted on: 2021-05-25    Degree: Master    Type: Thesis
Country: China    Candidate: Y Q Chen
GTID: 2428330611465910    Subject: Computer technology
Abstract/Summary:
Peer-to-peer (P2P) lending is an important type of internet finance. As an alternative method of financing, it enables borrowers to obtain loans directly from investors. After the 2008 financial crisis, individuals found it difficult to obtain small loans, and small businesses could rarely secure funding from traditional investors. This created a huge demand for small loans in the loan market, from which P2P lending gained significant momentum. Owing to its low cost and covenant-lite advantages, P2P lending has become an important channel for small loans and private lending. However, P2P lending lacks the client information available to traditional finance and depends heavily on an efficient credit reporting system and credit assessment, which results in significantly higher default risk than lending coordinated by financial institutions. To reduce this risk, P2P lending platforms need to build effective data-driven models for evaluating default risk. Loan default prediction, which predicts whether a borrower will default, bears directly on the survival of a P2P company.

Most studies try to enhance default prediction models by improving the classification method, without considering missing values or class imbalance in the dataset. This study focuses on handling missing values and data imbalance to improve loan default prediction. We propose a missing value imputation approach that describes categorical attributes as ordinal ones via a Bayesian method and then uses the center of each class, the standard deviation, and the distances between class centers for the imputation. To deal with imbalanced data, we adopt a hybrid undersampling method that combines clustering, the stochastic sensitivity measure, and radial basis function neural networks. The approaches are validated on 34 datasets from the UCI Machine Learning Repository (15 numerical, 11 categorical, and 8 mixed) and on real loan default data from a P2P company in China. The experimental results demonstrate that they yield better performance.
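
The two ideas named in the abstract, class-center-based imputation and clustering-based undersampling, can be illustrated with a minimal Python sketch. It is not the thesis's method: the Bayesian ordinal encoding, the use of standard deviations and inter-center distances, the stochastic sensitivity measure, and the radial basis function neural network are all omitted. The function names `class_center_impute` and `cluster_undersample`, the use of `None` to mark a missing value, and the plain k-means step are assumptions made only for illustration.

```python
import math
import random


def class_center_impute(rows, labels):
    """Replace None entries with the mean of that feature within the row's class.

    Simplifying assumption: the class center is the per-feature mean of the
    observed values; a feature never observed in a class falls back to 0.0.
    """
    n_features = len(rows[0])
    sums, counts = {}, {}
    for row, y in zip(rows, labels):
        s = sums.setdefault(y, [0.0] * n_features)
        c = counts.setdefault(y, [0] * n_features)
        for j, v in enumerate(row):
            if v is not None:
                s[j] += v
                c[j] += 1
    centers = {
        y: [sums[y][j] / counts[y][j] if counts[y][j] else 0.0
            for j in range(n_features)]
        for y in sums
    }
    return [
        [centers[y][j] if v is None else v for j, v in enumerate(row)]
        for row, y in zip(rows, labels)
    ]


def _assign(points, centroids):
    """Group points by the index of their nearest centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        groups[idx].append(p)
    return groups


def cluster_undersample(rows, labels, majority, k, iters=20, seed=0):
    """Shrink the majority class to at most k rows, one per k-means cluster.

    Rows must be complete (no None). All non-majority rows are kept unchanged.
    """
    major = [r for r, y in zip(rows, labels) if y == majority]
    rng = random.Random(seed)
    centroids = rng.sample(major, k)
    for _ in range(iters):
        groups = _assign(major, centroids)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    groups = _assign(major, centroids)
    # Keep the real sample closest to each centroid as that cluster's representative.
    kept = [
        min(g, key=lambda r: math.dist(r, c))
        for g, c in zip(groups, centroids) if g
    ]
    out_rows = kept + [r for r, y in zip(rows, labels) if y != majority]
    out_labels = [majority] * len(kept) + [y for y in labels if y != majority]
    return out_rows, out_labels
```

In a pipeline of this shape, imputation would run first so that the distance computations in the undersampling step see complete rows. Setting `k` to the size of the minority class gives a roughly balanced training set.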
Keywords/Search Tags:Loan Default Prediction, P2P, Missing values imputation, Imbalance data, BCCMVI, DSUS