Credit Risk Analysis Of Bank Users Based On Machine Learning Algorithm

Posted on:2022-04-09

Degree:Master

Type:Thesis

Country:China

Candidate:J C Wang

Full Text:PDF

GTID:2518306527952319

Subject:Applied Statistics

Abstract/Summary:

With the rapid development of the financial industry, various forms of credit consumption have become part of everyday life. As the population of credit consumers expands rapidly, major financial institutions face severe credit risk problems, and credit risk has become one of the important factors affecting the stable development of banks. Research and analysis of credit risk can therefore help banks identify potentially fraudulent users and reduce losses. This thesis takes real user data from the credit lending scenario of Xiamen International Bank as its research object. The feature variables cover basic personal information, historical borrowing and lending behavior, and so on.

In the preparation stage, the thesis preprocesses the data set, including outlier detection, missing value filling and categorical feature encoding. In the feature engineering stage, it analyzes the feature variables visually with Python visualization tools, extracts potential fraud signals from features such as the user area code and academic code, and constructs related feature variables. Using the correlation coefficient method and feature importance ranking, 40 features are selected as model inputs, and the data is balanced with the SMOTE-Tomek Links method. Support Vector Machines, Random Forests, XGBoost and LightGBM are then used to construct user risk assessment models, with grid search used to tune the model parameters.

The results show that, under AUC, F1-score and other evaluation indicators, LightGBM performs best, with an AUC of 0.836 and an F1-score of 0.723. To improve the user credit risk assessment model further, the thesis finally adopts the Stacking model fusion method. In the first stage, the better-performing Random Forest, XGBoost and LightGBM models serve as the base classifiers. In the second stage, a Logistic Regression model is trained on the outputs of the first stage. Most evaluation indexes under the Stacking fusion method are better than those of every single model, and the AUC reaches 0.842. Through analysis and experiments on bank credit lending data, the thesis builds four machine learning models in turn and then a user credit evaluation model based on the Stacking method. The final model performs well and offers a useful reference for user credit risk assessment.
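
The abstract compares models by AUC and F1-score. As a minimal sketch of what these two indicators measure, the following pure-Python example computes both on a small made-up sample. The labels, scores, the 0.5 decision threshold and the function names `f1_score` and `auc_score` are illustrative assumptions, not data or code from the thesis.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_score(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy sample: 1 = risky user, 0 = normal user.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6, 0.7, 0.2]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

auc = auc_score(y_true, y_score)
f1 = f1_score(y_true, y_pred)
print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")
```

AUC depends only on how the scores rank positives against negatives, while F1 depends on the chosen threshold, which is one reason for reporting both on imbalanced credit data.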

Keywords/Search Tags:

Credit risk, Feature engineering, XGBoost, LightGBM, Model fusion

Related items

1	Early Warning Of Enterprise Talent Loss Risk Based On XGBoost Model Fusion
2	Research On Enterprise Credit Assessment Based On Model Fusion
3	Research On 5G Telecom Customer Prediction Based On Data Mining
4	Based On Big Data And Integrated Learning Research On Prediction Of Malware Infection
5	Study On The Risk Prediction Model Of User Loan Based On Machine Learning
6	Multi-factor Stock Selection Scheme Design Based On XGBoost And LightGBM Algorithm
7	Comparative Study Of P2P Network Loan Default Forecasting Model Based On LightGBM And XGBoost Algorithm
8	Design And Implementation Of P2P Financial Risk Control System Based On Big Data
9	Prediction Model Of Housing Mortgage Loan Prepayment Risk Based On LightGBM Algorithm
10	Construction And Application Of Credit Risk Control Model Based On Deep Neural Network