Internet Financial Pre-lending Recognition And Model Expression Base On XGBoost

Posted on: 2020-12-26 | Degree: Master | Type: Thesis
Country: China | Candidate: W H Chen | Full Text: PDF
GTID: 2428330590994745 | Subject: Management Science and Engineering
Abstract/Summary:
Based on the XGBoost model, this paper studies the problem of identifying users at risk of becoming overdue before a loan is granted (pre-lending overdue risk) in the Internet finance scenario, and interprets and visualizes the model with the SHAP framework. To study this problem, the paper uses the public data set provided by the company Rong 360 and carries out variable cleaning, model construction, model comparison and visualization on a data set on the scale of 10,000 records with anonymous variables, in order to demonstrate the high accuracy and interpretability of the XGBoost model in pre-lending overdue scenarios. On the one hand, the paper takes the nature of each type of variable into account and systematically imputes the missing values of the anonymous variables, so that the variables are cleaned and the "Garbage In, Garbage Out" problem is avoided. On the other hand, with the training set and the test set held consistent across models, three classic models, Logistic Regression (LR), Random Forest (RF) and Gradient Boosting Decision Tree (GBDT), are built as baselines, and the accuracy comparison shows the superiority of the XGBoost model. To make the evaluation better suited to the actual business scenario, the paper selects seven indicators in three categories to judge the recognition performance of each model comprehensively. The indicators cover the ranking ability of the model, its ability to identify positive samples, and so on, and an expected return index is constructed to replace the conventional accuracy indicator. To improve the accuracy of the XGBoost model, the paper further tunes the hyperparameters of the fitted model. Finally, using SHAP, an interpretation framework for ensemble models, the contribution of each variable in the model is visualized from both the variable perspective and the sample perspective, and the anonymous variables are simulated in context so that the economic and management implications of the model can be better interpreted.
Keywords/Search Tags: Internet finance, pre-lending overdue recognition, parameter tuning, eXtreme Gradient Boosting, SHAP
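
The abstract mentions judging models by ranking ability and by an expected return index rather than plain accuracy, but it does not give the formulas. The sketch below shows one plausible form of each idea in plain Python. Everything in it is an assumption made for illustration: the function names, the toy labels and scores, and the payoff values `gain_good` and `loss_bad` are hypothetical and are not taken from the thesis. The KS statistic is used here as a common stand-in for "ranking ability"; the thesis may use different indicators.

```python
def confusion_counts(y_true, y_score, threshold):
    """Count (tp, fp, tn, fn); label 1 = overdue, score >= threshold = rejected."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        rejected = score >= threshold
        if rejected and label == 1:
            tp += 1
        elif rejected and label == 0:
            fp += 1
        elif not rejected and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_score, threshold):
    """Share of applicants classified correctly at the given threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
    return (tp + tn) / len(y_true)


def expected_return(y_true, y_score, threshold, gain_good=1.0, loss_bad=5.0):
    """Average return per applicant under assumed (hypothetical) payoffs.

    An approved good borrower earns gain_good, an approved overdue borrower
    costs loss_bad, and a rejected applicant yields nothing.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
    return (tn * gain_good - fn * loss_bad) / len(y_true)


def ks_statistic(y_true, y_score):
    """Largest gap between the cumulative score distributions of the two classes."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(y_score, y_true), reverse=True)
    cum_pos = cum_neg = 0
    best = 0.0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            cum_pos += 1
        else:
            cum_neg += 1
        # Evaluate the gap only once per group of tied scores.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie:
            best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
    return best


# Toy data (hypothetical): 1 = overdue, scores are predicted overdue risk.
y_true = [1, 1, 0, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

for threshold in (0.5, 0.75):
    print(threshold,
          accuracy(y_true, y_score, threshold),
          expected_return(y_true, y_score, threshold))
print("KS:", ks_statistic(y_true, y_score))
```

On this toy data both thresholds give the same accuracy (7 of 8 correct), yet the expected return differs, because approving one overdue borrower costs more than rejecting one good borrower. That asymmetry is the usual motivation for replacing accuracy with a return-based indicator in lending.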