Research On Loan Default Problem Based On Imbalanced Data

Posted on: 2023-09-28
Degree: Master
Type: Thesis
Country: China
Candidate: H R He
GTID: 2569306623995549
Subject: Applied statistics
Abstract/Summary:
Internet financial platforms take many forms, and the associated problems of credit risk and user fraud are becoming increasingly prominent. In business analysis, a user's credit rating is an important indicator that guides the decisions of lenders such as banks. It is therefore of both theoretical and practical significance to study user characteristics comprehensively and to build a suitable model for in-depth analysis. In practice, however, defaulting users are far outnumbered by normal users, so the data are severely imbalanced, which greatly reduces the effectiveness of traditional machine learning algorithms. This thesis uses the Give Me Some Credit dataset from the Kaggle competition platform for empirical analysis. It focuses on handling imbalanced data from two directions, data sampling and algorithm improvement, and clarifies the difficulties of imbalanced data processing as well as the advantages and disadvantages of the various methods. The research process is described in detail: data preprocessing, visual analysis, imbalanced data processing, model construction and optimization, and model fusion. To make full use of the different feature information about users and obtain more accurate classification, XGBoost and LightGBM models are built first; the outputs of their leaf nodes then form new features that are fed into a logistic regression model for training, yielding the XGB_LR and LGB_LR models. Finally, XGB_LR, LGB_LR, and Balanced Random Forest, an improved random forest model for imbalanced data, are combined by weighted fusion. The experimental results show that the fusion model achieves the best overall performance.
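The leaf-node encoding and weighted fusion described above can be illustrated with a minimal sketch. It is not the thesis's implementation: scikit-learn's GradientBoostingClassifier stands in for both XGBoost and LightGBM, a class-weighted RandomForestClassifier stands in for Balanced Random Forest, the data are synthetic rather than Give Me Some Credit, and the fusion weights, hyperparameters, and helper names (fit_gbdt_lr, predict_gbdt_lr) are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic imbalanced data (roughly 7% positives); a stand-in for the real dataset.
X, y = make_classification(n_samples=3000, n_features=10, n_informative=5,
                           weights=[0.93, 0.07], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)


def fit_gbdt_lr(X_tr, y_tr, seed, subsample):
    """Fit a boosted-tree model, then a logistic regression on its leaf indices."""
    gbdt = GradientBoostingClassifier(n_estimators=50, max_depth=3,
                                      subsample=subsample, random_state=seed)
    gbdt.fit(X_tr, y_tr)
    # apply() returns (n_samples, n_estimators, 1) for binary problems:
    # the index of the leaf each sample reaches in each tree.
    leaves = gbdt.apply(X_tr)[:, :, 0]
    # One-hot encode the leaf indices so each leaf becomes a binary feature.
    encoder = OneHotEncoder(handle_unknown="ignore")
    lr = LogisticRegression(max_iter=1000, class_weight="balanced")
    lr.fit(encoder.fit_transform(leaves), y_tr)
    return gbdt, encoder, lr


def predict_gbdt_lr(model, X_new):
    """Return the positive-class probability from a fitted (gbdt, encoder, lr) triple."""
    gbdt, encoder, lr = model
    leaves = gbdt.apply(X_new)[:, :, 0]
    return lr.predict_proba(encoder.transform(leaves))[:, 1]


# Two boosted-tree + LR models (stand-ins for XGB_LR and LGB_LR).
model_a = fit_gbdt_lr(X_train, y_train, seed=0, subsample=1.0)
model_b = fit_gbdt_lr(X_train, y_train, seed=1, subsample=0.8)

# Class-weighted random forest (stand-in for Balanced Random Forest).
forest = RandomForestClassifier(n_estimators=200,
                                class_weight="balanced_subsample",
                                random_state=0)
forest.fit(X_train, y_train)

# Weighted fusion of the three probability outputs; the weights are assumed.
weights = np.array([0.4, 0.4, 0.2])
probs = np.column_stack([
    predict_gbdt_lr(model_a, X_test),
    predict_gbdt_lr(model_b, X_test),
    forest.predict_proba(X_test)[:, 1],
])
fused = probs @ weights
```

Here `fused` holds one default probability per test sample. In a real experiment the fusion weights would normally be tuned on a validation set, and the logistic regression is often trained on data held out from the tree-fitting step to limit overfitting; the abstract does not say which choices the thesis made.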
Keywords/Search Tags:Online Finance, Unbalanced Data, BalancedRandomForest, Model Merging