Integrated Method Of Personal Credit Risk Assessment For Online Loans

Posted on:2023-03-07

Degree:Master

Type:Thesis

Country:China

Candidate:Z Q Xu

GTID:2558307100972039

Subject:Applied Statistics

Abstract/Summary:

Many Internet-based credit platforms have emerged alongside the development of network information technology and finance. Innovation is always accompanied by risk, and online credit is no exception: credit defaults not only strain the cash flow of lending platforms but also harm the healthy and sustainable development of the national economy. China’s financial platforms therefore urgently need a stable, accurate, and discriminating personal credit risk assessment model to control and prevent credit risk. This paper uses real credit data from a US P2P online lending platform, covering the first quarter of 2018 through the third quarter of 2020, to provide a more stable and efficient basis for personal risk assessment on online lending platforms.

The original dataset was preprocessed as follows: deletion of unrelated features, handling of missing values and outliers, data format conversion, encoding of character data, and sample balancing. To balance the samples, two resampling methods, random downsampling and SMOTE upsampling, were applied to the unbalanced training set; comparing the performance of each model showed that random downsampling gave better results than SMOTE upsampling. For feature selection, variance filtering, correlation testing, and the random forest algorithm were applied in turn to progressively eliminate highly correlated and unimportant features, leaving 19 features.

The paper first uses grid search to find the optimal parameters of each model and improve model performance. It then builds three single models (logistic regression, random forest, and LightGBM) and measures their predictive performance by comparing the AUC, precision, stability, and KS values of each model. Finally, it uses the Voting framework to obtain a more effective fusion model from the three single models and compares the performance of the ensemble model across the different metrics.

The empirical results show that: (1) at the data level, 19 features remain after data cleaning and feature engineering, and models built after random downsampling of the training set perform better than those built after SMOTE upsampling; (2) at the model level, the fusion model based on the Random Under Sampling-Voting framework improves the prediction accuracy, stability, and risk discrimination ability of the general model to a certain extent. In AUC, the Random Under Sampling-Voting fusion model is better than logistic regression and random forest but slightly lower than the boosting algorithm Random Under Sampling-LightGBM. On the KS indicator its performance is the best, indicating that the Voting model has the best discrimination, while the Random Under Sampling-LightGBM model has the highest accuracy. In terms of stability, the fusion model is second only to the traditional learning algorithm, logistic regression, and is more stable than the other two ensemble algorithms.

In the risk management of every credit institution, it is essential to control borrower default risk, screen for more high-quality customers, track and monitor customers' long-term risk, and build a precise and stable risk assessment model that predicts borrower repayment. The research in this paper therefore offers some reference value.
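
The three mechanics named in the abstract (random downsampling, a voting fusion of model outputs, and the KS indicator) can be sketched in plain Python. This is an illustrative sketch only, not the thesis's code: the function names `random_undersample`, `soft_vote`, and `ks_statistic` are hypothetical, the toy numbers are made up for demonstration, and averaging probabilities ("soft" voting) is an assumption, since the abstract does not say whether hard or soft voting was used.

```python
import random
from collections import defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    target = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


def soft_vote(probabilities_per_model):
    """Average the default probabilities predicted by several models.

    Assumption: soft voting; the abstract only names a "Voting framework".
    """
    n_models = len(probabilities_per_model)
    return [sum(column) / n_models for column in zip(*probabilities_per_model)]


def ks_statistic(labels, scores):
    """Largest gap between the cumulative score distributions of
    defaulters (label 1) and non-defaulters (label 0)."""
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), reverse=True)
    seen_pos = seen_neg = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        # Consume every row tied at this score before measuring the gap.
        score = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                seen_pos += 1
            else:
                seen_neg += 1
            i += 1
        best = max(best, abs(seen_pos / n_pos - seen_neg / n_neg))
    return best


# Toy data: 2 defaulters among 8 borrowers (made-up values).
labels = [1, 0, 0, 0, 1, 0, 0, 0]
rows = list(range(8))
balanced_rows, balanced_labels = random_undersample(rows, labels, seed=42)

# Two hypothetical models scoring four borrowers whose true labels are 1, 0, 1, 0.
fused = soft_vote([[0.9, 0.2, 0.4, 0.1], [0.7, 0.4, 0.6, 0.3]])
ks = ks_statistic([1, 0, 1, 0], fused)
```

In the toy example the fused scores rank both defaulters above both non-defaulters, so the KS value reaches its maximum of 1.0; on real data it lies between 0 and 1, with higher values indicating better separation of good and bad borrowers. A practical pipeline would more likely rely on library implementations of resampling, voting, and model fitting than on hand-written helpers like these.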

Keywords/Search Tags:

P2P personal credit assessment, unbalanced sample, logistic regression, integrated algorithm

Related items

1	Personal Credit Risk Assessment Under Unbalanced Data Sets
2	Research On Logistic Credit Risk Evaluation Model Based On Sample Information
3	Research On Personal Credit Evaluation Based On Credit Platform Data
4	Design And Analysis Of Personal Credit Scorecard Based On Logistic Regression
5	The Design And Implementation Of Personal Credit Evaluation System Based On Neo4j Database
6	Research On Internet Personal Credit Risk Prediction Based On Machine Learning
7	Application Of Data Mining In Personal Credit Risk Identification Of P2P Online Loan
8	Application Research Of Data Mining Technology In Personal Credit Score Prediction
9	Personal Credit Score Modeling And Analysis Based On Data Mining
10	Prediction Of Personal Credit Default Risk Based On Machine Learning