Research On Credit Evaluation Of Feature Engineering Based On Ensemble Learning

Posted on: 2021-04-14  Degree: Master  Type: Thesis
Country: China  Candidate: S W Li  Full Text: PDF
GTID: 2428330602471497  Subject: Applied Mathematics
Abstract/Summary:
As the contemporary “economic ID card”, credit is an inevitable product of social and economic development. A sound credit evaluation system gives banks and other financial institutions accurate “portraits” of their customers, which makes it easier to provide appropriate financial services. It also accelerates the construction of a social credit system that prevents credit risks and improves the efficiency of financial resource allocation. With the development of data mining technology and machine learning algorithms, data-driven methods have advanced credit evaluation technology, and building more efficient and robust credit evaluation models has become a focus of current research. This thesis aims to identify the important features for credit risk assessment, improve feature engineering for credit evaluation, and establish an efficient, robust credit evaluation model that generalizes well. Credit datasets are characterized by large sample sizes, high feature dimensionality, and imbalanced class proportions. To account for these characteristics, the credit dataset is processed with feature engineering and related steps to select suitable feature variables, which yields the input features and training dataset for model training and optimizes the modeling process. Combining the ensemble ideas of Boosting and Bagging, a Bagging ensemble credit evaluation model based on the XGBoost algorithm (B-XGB) is constructed. An empirical comparison with logistic regression, GBDT, random forest, and XGBoost models shows that the proposed model improves performance to some extent and is better at discriminating and predicting “good” and “bad” customers. Balancing the credit dataset by combining different sampling methods effectively improves the model's ability to identify “bad” customers and yields higher recall, which offers guidance for the prevention of credit risks.
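The two ideas in the abstract, bagging over a base learner and rebalancing the training set before fitting, can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the thesis's implementation: the base learner here is a one-feature decision stump standing in for XGBoost, random undersampling stands in for the sampling methods (which the abstract does not name), the data are synthetic, and every name (`make_data`, `undersample`, `Stump`, `BaggingEnsemble`) is hypothetical.

```python
import random
from collections import Counter


def make_data(n, bad_rate, seed):
    """Synthetic toy data (assumption): label 1 = "bad" customer, 0 = "good"."""
    rng = random.Random(seed)
    n_bad = int(n * bad_rate)
    y = [1] * n_bad + [0] * (n - n_bad)
    # "Bad" customers are shifted on the first feature; the second is pure noise.
    X = [[rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]
    return X, y


def recall(y_true, y_pred, positive=1):
    """Share of actual positives ("bad" customers) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def undersample(X, y, rng, minority=1):
    """Random undersampling: keep every minority row, draw as many majority rows."""
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    majority_idx = [i for i, label in enumerate(y) if label != minority]
    k = min(len(minority_idx), len(majority_idx))
    keep = minority_idx + rng.sample(majority_idx, k)
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]


class Stump:
    """Stand-in base learner (NOT XGBoost): a single-feature threshold rule."""

    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):
            for thr in sorted({row[j] for row in X}):
                for sign in (1, -1):
                    pred = [1 if sign * (row[j] - thr) >= 0 else 0 for row in X]
                    err = sum(p != t for p, t in zip(pred, y))
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        _, self.j, self.thr, self.sign = best
        return self

    def predict(self, X):
        return [1 if self.sign * (row[self.j] - self.thr) >= 0 else 0 for row in X]


class BaggingEnsemble:
    """Bagging: fit each learner on a bootstrap sample, predict by majority vote."""

    def __init__(self, make_learner, n_estimators=11, seed=0):
        self.make_learner = make_learner
        self.n_estimators = n_estimators  # odd, so binary votes cannot tie
        self.rng = random.Random(seed)
        self.models = []

    def fit(self, X, y):
        n = len(X)
        self.models = []
        for _ in range(self.n_estimators):
            idx = [self.rng.randrange(n) for _ in range(n)]  # draw with replacement
            Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
            self.models.append(self.make_learner().fit(Xb, yb))
        return self

    def predict(self, X):
        votes = [m.predict(X) for m in self.models]
        return [Counter(col).most_common(1)[0][0] for col in zip(*votes)]


X_train, y_train = make_data(200, 0.10, seed=1)
X_test, y_test = make_data(200, 0.10, seed=2)

# Ensemble fitted on the imbalanced training set as-is.
pred_plain = BaggingEnsemble(Stump).fit(X_train, y_train).predict(X_test)

# Ensemble fitted after random undersampling of the "good" majority class.
X_bal, y_bal = undersample(X_train, y_train, random.Random(0))
pred_balanced = BaggingEnsemble(Stump).fit(X_bal, y_bal).predict(X_test)

print("recall without balancing: ", recall(y_test, pred_plain))
print("recall with undersampling:", recall(y_test, pred_balanced))
```

The `make_learner` argument is any zero-argument callable that returns an object with `fit` and `predict`, so a gradient-boosted learner could be substituted for `Stump` without changing the bagging logic. The printed recall values depend on the synthetic data and say nothing about the results reported in the thesis.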
Keywords/Search Tags:Credit Evaluation, Ensemble Learning, Feature Engineering, Imbalanced Dataset, Model Evaluation