Prediction Model Of Personal Customer Default Rate Of Financial Institution Using Ensemble Learning Algorithm

Posted on: 2021-04-25 | Degree: Master | Type: Thesis
Country: China | Candidate: X S Xu
GTID: 2518306476454524 | Subject: Financial engineering
Abstract/Summary:
In 2019, China's total retail sales of consumer goods reached 41.2 trillion yuan, and consumption contributed 57.8% to economic growth, driving GDP growth by 3.5 percentage points. Consumption has been the primary driver of economic growth for six consecutive years. At the same time, the scale of personal credit business has been increasing, and many risks are gradually being exposed. Among them, credit risk is the most important and the most complex, so predicting whether a borrower will default is of practical significance. The main purpose of this thesis is to predict whether a loan will default. Personal credit data from a financial institution in China are used to construct a classification prediction model for empirical research. The data set contains 60,000 samples and 226 fields. First, data cleaning and feature engineering are performed on the data set; after preprocessing, the number of features is reduced to 200. Second, default prediction models are built using the Logistic regression model, the support vector machine, the CART decision tree, the random forest (RF) algorithm, the XGBoost algorithm, the CatBoost algorithm and the LightGBM algorithm. The performance of each classification model is measured by the AUC metric. The results show that the Boosting algorithms perform best, followed by the Logistic regression model, then the RF algorithm and the support vector machine, and finally the CART decision tree. Finally, following the blending and stacking approaches to model combination, the four best-performing models above are combined. The resulting model of this thesis is a two-layer ensemble default prediction model that integrates the Boosting ensemble learning algorithms with the traditional binary logistic regression model. The results show that the AUC of the combination model is better than that of the Boosting algorithms and the traditional default prediction models described above. The process of constructing the combination model offers guidance for predicting the default of individual customers, and also has some reference value for the risk control of financial institutions.
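As a rough illustration of the evaluation idea in the abstract, the following sketch computes AUC from scratch and compares two base models with a simple weighted blend of their scores. It is not the thesis's two-layer stacking model: the blend here is a plain weighted average rather than a learned logistic regression meta-model, and the labels, scores, and the function names `auc` and `blend` are hypothetical, chosen only for this example.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen defaulter (label 1)
    receives a higher score than a randomly chosen non-defaulter (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def blend(score_lists: Sequence[Sequence[float]],
          weights: Sequence[float]) -> List[float]:
    """Weighted average of several models' predicted default scores."""
    total = sum(weights)
    n_samples = len(score_lists[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, score_lists)) / total
        for i in range(n_samples)
    ]


# Hypothetical toy data: 1 = default, 0 = no default.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]  # made-up scores from one base model
model_b = [0.6, 0.9, 0.3, 0.2, 0.4, 0.1]  # made-up scores from another

blended = blend([model_a, model_b], weights=[1.0, 1.0])

print("AUC model A:", auc(labels, model_a))
print("AUC model B:", auc(labels, model_b))
print("AUC blend:  ", auc(labels, blended))
```

On this toy data each base model misranks one defaulter/non-defaulter pair, while the averaged scores rank every pair correctly. That outcome is a property of the made-up numbers, not a general guarantee: a combination can also score lower than its best component, which is why the combined model has to be evaluated on held-out data.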
Keywords/Search Tags:Fraud Detection, Ensemble Learning, Boosting, Combination Model