XGBoost-based Online Loan Risk Prediction

Posted on: 2021-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: K L Yu
Full Text: PDF
GTID: 2428330614954489
Subject: Applied statistics
Abstract/Summary:
As the lending business of banks and financial institutions grows, bad debts have had a serious negative impact on the development of the Internet finance market. How to predict whether customers will default, and thereby maximize profit, is the foremost concern of credit institutions. Establishing an accurate credit scorecard model that helps reduce non-performing loans is therefore of great significance to the stable development of the market economy. In the background and data preparation part, the thesis first introduces the concept of credit risk and the development of P2P lending, and explains the theory behind several machine learning models and credit scorecards. Second, for the personal credit data set of an Internet finance platform, data exploration, data preprocessing, feature engineering and related operations are used to produce data suitable for modeling. In the empirical analysis, baseline models and an XGBoost model are first established. The AUC and KS values are used as evaluation metrics to study each model's ability to discriminate overdue customers, and the XGBoost model is found to outperform the three baselines: logistic regression, decision tree and random forest. Second, to address the class imbalance in the data, a cost-sensitive learning strategy is introduced and an improved XGBoost model is proposed. Compared with the original XGBoost model, the improved model predicts overdue risk more accurately. Finally, Stacking model fusion is applied to XGBoost to propose a new model. Compared with the original XGBoost model, the new model's AUC and KS values improve to some extent, and its prediction of overdue risk is more accurate.
Keywords/Search Tags: Loan risk, Logistic regression, Decision tree, Random forest, XGBoost
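
The abstract evaluates models by their AUC and KS values. As a rough illustration of what these two metrics measure, the following minimal sketch computes both from a list of labels (1 = overdue, 0 = not overdue) and predicted scores using only the Python standard library. The function names `auc_score` and `ks_statistic` and the toy data are assumptions made for illustration; they are not taken from the thesis, whose actual implementation is not described here.

```python
def auc_score(labels, scores):
    """Rank-based AUC (Mann-Whitney U), averaging ranks over tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def ks_statistic(labels, scores):
    """Largest gap between cumulative TPR and FPR over all score thresholds."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every sample that shares the current score threshold.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
        i = j
    return best


# Hypothetical toy data: 3 overdue customers among 8, with model scores.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print(round(auc_score(labels, scores), 4))     # 0.9333
print(round(ks_statistic(labels, scores), 4))  # 0.8
```

AUC is the probability that a randomly chosen overdue customer receives a higher score than a randomly chosen non-overdue one, and KS is the largest separation between the two groups' cumulative score distributions, so both reward models that rank overdue customers above the rest.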