Personal Credit Scoring Model Based On Ensemble Strategies

Posted on: 2021-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: X C Liu
GTID: 2428330614954481
Subject: Applied statistics
Abstract/Summary:
The rapid development of China's Internet has produced an active online economy: third-party payment services are constantly updated, credit products of many kinds have emerged, and credit problems have become more prominent. To support the healthy development of China's credit economy, the second-generation credit system of the People's Bank of China was put into trial operation between the end of 2018 and the beginning of 2019. The new credit system covers more business from non-bank institutions and has expanded its data collection channels, with the aim of using advanced technology to provide more efficient and accurate services. However, the system is not yet mature and there is still considerable room for improvement, especially at the technical level of data mining.

This paper studies existing machine learning algorithms through a combination of comparative and empirical analysis and optimizes the models with ensemble strategies, with the goal of improving the identification and avoidance of personal credit defaults. The main idea of an ensemble strategy is to optimize the base models and then integrate them, so as to build a model with stable performance and higher accuracy. The model is constructed as follows. After thorough data cleaning, the hyperparameters of each model are tuned by Bayesian optimization. A comparison of the different types of model shows that the ensemble models perform well in accuracy, predictive ability, and classification ability. On this basis, three further ensemble methods are applied: Stacking, Blending, and Voting. Finally, the robustness of the ensemble model is analyzed from several evaluation angles (precision, classification ability, generalization ability, etc.).

The empirical analysis uses Taiwan credit data and compares logistic regression, the Bernoulli-based Bayesian model, SVM (support vector machine), random forest, the extremely randomized trees (extra trees) model, AdaBoost, GBDT, and XGBoost. On this basis, the three strategies of Stacking, Voting, and Blending are used to construct an efficient ensemble model.

The results show three things. First, the serial ensemble model (GBDT) performs similarly to the heterogeneous ensemble strategies (Stacking, Blending, Voting). Second, the relatively complex Stacking model predicts less well than the relatively simple GBDT model, which shows that model complexity is not proportional to accuracy. Third, the structurally simpler Voting and Blending strategies reach an accuracy of 81% and an AUC of 0.78 or more, with the classification ability indices KS and Kappa at 0.315 and 0.373 respectively, which is significantly better than the results of the other single models.
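
The abstract names Voting as one of its ensemble strategies and reports KS and Kappa as classification ability indices, but it does not give the implementation. The following is only a minimal sketch of how soft voting and these two metrics could be computed, using the standard library alone. The function names, the three hypothetical base models, the toy scores, and the 0.5 decision threshold are all assumptions made for illustration; none of them come from the thesis, and the numbers produced are not its results.

```python
from collections import Counter


def soft_vote(prob_lists, weights=None):
    """Soft voting: weighted average of the default probabilities
    predicted by several base models for the same samples."""
    weights = weights or [1.0] * len(prob_lists)
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, probs)) / total
        for probs in zip(*prob_lists)
    ]


def ks_statistic(y_true, scores):
    """KS: largest gap between the cumulative share of defaulters (label 1)
    and non-defaulters (label 0) when samples are ranked by score."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        j = i
        # Samples sharing one score are counted together (tie handling).
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
        i = j
    return best


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels and predictions,
    corrected for the agreement expected by chance.
    Assumes chance agreement is below 1 (more than one class present)."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    classes = set(y_true) | set(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in classes) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy data (made up): 1 = default, 0 = no default.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
# Default probabilities from three hypothetical base models.
model_a = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 0.4, 0.5]
model_b = [0.2, 0.1, 0.5, 0.5, 0.9, 0.6, 0.2, 0.6]
model_c = [0.3, 0.3, 0.1, 0.5, 0.8, 0.7, 0.3, 0.3]

ensemble_scores = soft_vote([model_a, model_b, model_c])
y_pred = [1 if p >= 0.5 else 0 for p in ensemble_scores]  # assumed threshold

ks = ks_statistic(y_true, ensemble_scores)
kappa = cohen_kappa(y_true, y_pred)
print(f"KS={ks:.3f}  Kappa={kappa:.3f}")
```

Soft voting averages probabilities rather than counting hard votes, which keeps a ranked score available for threshold-free metrics such as KS and AUC. Stacking and Blending differ in that they train a second-level model on the base models' outputs instead of averaging them.
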
Keywords/Search Tags:Personal credit assessment, Ensemble strategy, Machine learning, Data mining