An XGBoost-Based Ensemble Learning Approach To Personal Credit Risk Assessment

Posted on: 2019-01-02  Degree: Master  Type: Thesis
Country: China  Candidate: X G Li  Full Text: PDF
GTID: 2428330542999337  Subject: Statistics
Abstract/Summary:
When building a classifier for credit card data, the "good" and "bad" customer classes are severely imbalanced. Within the framework of sampling methods, data imbalance is mainly addressed by undersampling, oversampling, or a combination of the two. In general, undersampling causes information loss, and oversampling can easily lead to overfitting. This thesis proposes a quasi-bagging method based on XGBoost and the idea of ensemble learning. The method is simple to implement. It randomly partitions the majority-class samples into groups, then builds one sub-model per group from that group's majority-class samples together with a certain proportion of, or all of, the minority-class samples. The final result is the mean or the majority vote of the sub-model outputs. The method draws on the ensemble learning idea behind bagging and uses all of the sample information in the training set to construct the classifier, which improves model accuracy. Each sub-model is fitted with XGBoost, which is based on gradient boosting. Furthermore, the consistency and other properties of the method are discussed. The results of an empirical analysis show that the method classifies better than some existing methods.
Keywords/Search Tags: credit risk, XGBoost, imbalanced data, ensemble learning, quasi-bagging, ROC curve
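
The grouping-and-averaging procedure described in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract, not the thesis's actual implementation: the function names (`fit_quasi_bagging`, `predict_proba_mean`, `predict_vote`), the label coding (1 = minority "bad" customers, 0 = majority "good" customers), the `minority_frac` parameter, and the XGBoost hyperparameters are all assumptions. The default factory assumes the `xgboost` package is installed; any object with `fit` and `predict_proba` can be substituted.

```python
import numpy as np


def make_xgb():
    """Default sub-model factory (assumption: the xgboost package is installed)."""
    from xgboost import XGBClassifier  # imported lazily so the sketch loads without it
    # Hyperparameters are placeholders, not values taken from the thesis.
    return XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)


def fit_quasi_bagging(X, y, n_groups, make_model=make_xgb, minority_frac=1.0, seed=0):
    """Train one sub-model per random group of majority-class samples.

    Assumes binary labels: 1 = minority class, 0 = majority class.
    """
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    majority = rng.permutation(np.flatnonzero(y == 0))  # shuffled majority indices
    minority = np.flatnonzero(y == 1)
    n_min = max(1, int(round(minority_frac * len(minority))))

    models = []
    # array_split yields disjoint groups, so each majority sample is used
    # by exactly one sub-model and none is discarded.
    for group in np.array_split(majority, n_groups):
        chosen_min = rng.choice(minority, size=n_min, replace=False)
        idx = np.concatenate([group, chosen_min])
        model = make_model()
        model.fit(X[idx], y[idx])
        models.append(model)
    return models


def predict_proba_mean(models, X):
    """Average the sub-models' predicted minority-class probabilities."""
    X = np.asarray(X)
    return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)


def predict_vote(models, X, threshold=0.5):
    """Majority vote over sub-model labels; ties go to the minority class."""
    X = np.asarray(X)
    votes = np.array([m.predict_proba(X)[:, 1] >= threshold for m in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```

With `minority_frac=1.0` every sub-model sees all minority samples; a smaller value draws a random subset per group, matching the abstract's "a certain proportion or all". A natural starting point for `n_groups` is the majority-to-minority ratio, so that each sub-model trains on roughly balanced data, though the abstract does not state how the thesis chooses it.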