The Application Of Boosting Family Algorithm In Credit Rating Card Model

Posted on: 2022-03-11 | Degree: Master | Type: Thesis
Country: China | Candidate: R Wang
GTID: 2518306509489294 | Subject: Applied Statistics
Abstract/Summary:
In the era of big data, the vast volume of interactive data is constantly updated and accumulated. This gives the Internet finance industry a more comprehensive set of reference dimensions, and it also raises new challenges. The credit scoring system is a milestone achievement in the financial field. As credit transactions become a new means of promotion within the commercial closed loop, credit scoring provides a decision-making standard for banks, financial institutions and consumer loan companies. The credit scorecard is a key tool for risk control in the financial industry; it was originally built on the logistic regression model, and its development is now very mature. On this basis, making the fullest use of the available information and applying recent data mining techniques to improve the evaluation of users' credit has practical significance. Given the strong performance of boosting algorithms in improving model robustness and accuracy, this thesis explores the performance of three algorithms, XGBoost, LightGBM and CatBoost, in the credit scoring model. It compares their improvement on the evaluation metrics KS and AUC, as well as the degree of improvement obtained by combining them with logistic regression.

The first chapter introduces the background and key issues of the topic. The second chapter covers the development history and research trends: its first part describes the development of scoring systems and scorecards, together with specific cases, and its second part reviews research on scorecard modeling by scholars in China and abroad. The third chapter introduces the basic concepts, definitions and classification of credit scorecards. The fourth chapter introduces the boosting algorithms from three aspects: algorithm principles, algorithm characteristics and modeling parameters. The fifth chapter carries out an empirical analysis on two real data sets. The sixth chapter gives the summary and outlook.

In the empirical analysis, data cleaning and feature selection are first applied to the two data sets. Single models are then built for each data set with the XGBoost, LightGBM and CatBoost algorithms, after which Blending is used to build a fusion model. The single models built with the boosting algorithms show a certain improvement over logistic regression. LightGBM performs best on the first data set, and the Blending fusion model performs best on the second data set. Next, the SMOTE algorithm is applied to the imbalanced data sets to generate new data sets. Compared with the original data, oversampling brings no obvious improvement to the single models, and the evaluation metrics of some models even decrease. Finally, the results of the three single boosting models on the original and oversampled data are compared under different parameter choices, and Bayesian parameter tuning is found to give the best AUC on the test set.
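
The abstract evaluates every model by KS and AUC. As a rough illustration of what these two metrics measure, the following sketch computes both from predicted scores in plain Python. It is not taken from the thesis: the function names `auc_score` and `ks_statistic`, the sample labels and the sample scores are all hypothetical, and it assumes that label 1 marks a defaulting customer and that a higher score means higher predicted risk.

```python
def auc_score(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) form; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover every sample tied with sample i.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def ks_statistic(labels, scores):
    """KS: largest gap between cumulative positive and negative rates over all thresholds."""
    pairs = sorted(zip(scores, labels), reverse=True)  # descending by score
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Count every sample that shares this score before measuring the gap.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best


# Hypothetical example data (1 = default, 0 = non-default).
sample_labels = [1, 1, 0, 1, 0, 0]
sample_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print("AUC:", auc_score(sample_labels, sample_scores))
print("KS:", ks_statistic(sample_labels, sample_scores))
```

Both metrics depend only on how the model ranks customers, not on the absolute score values, which is why they can be used to compare a logistic regression scorecard with boosting models on the same test set.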
Keywords/Search Tags: Credit Score Card, Boosting Family Algorithm, Model Parameter Adjustment, Fusion Model