The Research Of Chinese P2P Lending Default Prediction Model Based On Data Mining Technology

Posted on: 2017-09-12  Degree: Master  Type: Thesis
Country: China  Candidate: J W Liu  Full Text: PDF
GTID: 2348330488958122  Subject: Business management
Abstract/Summary:
As "Internet Plus" thinking has spread in recent years and reached into every aspect of social life, the Internet has brought great changes to the financial industry. P2P lending platforms boomed at first and now face fierce competition. P2P (peer-to-peer) lending allows individuals to borrow through a platform from a group of other individuals; the loans are usually unsecured and are made without the involvement of banks or other financial institutions.
Because no third-party institution such as a bank takes part in the lending process, risk prevention and control on P2P platforms rely essentially on personal credit rating systems. However, China has no clear legal regulations in this area, the cost of collecting borrowers' credit information is very high, and the citizen credit reporting system is not sound. Information is also asymmetric: borrowers know exactly how they will use the loan, how willing they are to repay, and how solvent they are, while lenders do not have complete information about borrowers. In this situation moral hazard and adverse selection are very common, leading to frequent default and fraud by borrowers. When lenders cannot accurately judge the degree of credit risk and cannot trust the platform, the P2P market may come to operate inefficiently, and in the long term the development of the P2P industry will inevitably be severely hampered. By the end of 2014 there were over 896 problem P2P platforms, involving more than 8 billion RMB.
This study introduces data mining algorithms, compares the performance of several classic data mining algorithms on imbalanced data, and finally chooses the random forests algorithm to construct the model.
A random forests model can take all available information about borrowers as explanatory variables and does not require variable encoding, normalization, cross inspection, or refinement, which avoids discarding hidden but useful information. In this research, we collected about 130,000 records from a Chinese P2P lending platform and, after data preprocessing, obtained 122,804 valid records, which we used to construct a P2P lending default prediction model with the random forests algorithm. We then used the indicator "Dist" to optimize the model. The prediction results on the test set show that the model performs well: Precision = 0.978, Recall = 0.7002, AUC = 0.803. The default prediction model meets the requirements of P2P lending risk control in China and provides a useful reference for it.
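For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how Precision, Recall, and AUC can be computed from true labels and model scores. It is not the thesis's code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and label 1 is simply treated as the positive class (the abstract does not say which class it treats as positive).

```python
from typing import List, Tuple


def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Precision and recall for the class labeled 1 (assumed positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical, imbalanced toy data: 8 examples of class 1, 2 of class 0.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.85, 0.7, 0.6, 0.4, 0.95, 0.3, 0.5, 0.2]
threshold = 0.5  # assumed cut-off for turning scores into class labels
y_pred = [1 if s >= threshold else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
auc = roc_auc(y_true, scores)

if __name__ == "__main__":
    print(f"Precision={precision:.3f} Recall={recall:.3f} AUC={auc:.3f}")
    # prints: Precision=0.857 Recall=0.750 AUC=0.875
```

Precision and recall depend on the chosen threshold, while AUC is computed from the scores alone. That is why the three are usually reported together when the classes are imbalanced.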
Keywords/Search Tags: P2P Lending, Information Asymmetry, Data Mining, Unbalanced Data, Default Prediction Model