Research Of Personal Credit Scoring Model Based On Ensemble Learning

Posted on: 2023-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: L Xiao
GTID: 2568306806473354
Subject: Software engineering
Abstract/Summary:
In recent years, as the personal credit business of banks and other financial institutions has continued to grow, the problem of financial fraud has gradually emerged. It is therefore of practical significance to predict user behavior through analysis of user data, to develop corresponding strategies that optimize customer classification, and to provide managers with a more scientific pre-lending credit risk assessment. Personal credit risk control has shifted from traditional manual review to approaches based on machine learning and deep learning. Most banks and other financial institutions currently rely on a single risk assessment model, such as logistic regression or a decision tree. As the industry develops, the performance of single models increasingly fails to keep up with ever more complex fraud scenarios. To address the needs of personal credit risk assessment, this thesis constructs a personal credit scoring model based on an improved Stacking ensemble algorithm, aiming to obtain a better-performing risk control model. The main research contents are as follows.

First, data processing. The data obtained in the credit business often contains considerable uncertainty and irregularity. To reduce the influence of data quality on model performance, missing values and outliers are analyzed and processed in this thesis. In addition, because non-default samples usually outnumber default samples in risk control, the SMOTE sampling method is used to deal with the class imbalance problem. Together these steps provide a complete solution for pre-processing personal credit data.

Second, a multidimensional feature screening strategy is designed. To obtain feature variables with stronger predictive power, this thesis scores the variables layer by layer using two types of indicators, namely feature importance and IV (information value), and designs formulas for ranking the variables by composite score. The feature set with a contribution of more than 85% is selected as the final set of input feature variables, so that the model is optimized at the feature screening stage. The experiments show that the AUC value, KS value, and accuracy of the model trained on the screened dataset all improve compared with those before screening. This indicates that the screening strategy designed in this thesis has a positive effect on the performance of the credit scoring model.

Third, a personal credit scoring model based on an improved Stacking ensemble algorithm is proposed. The traditional Stacking algorithm cannot reflect performance differences among base learners. In this thesis, the Stacking algorithm is improved through discrimination-based weighting and feature expansion, in light of the credit scoring problem studied. A personal credit scoring model and score conversion rules are then constructed based on the improved Stacking algorithm, and the model output is converted into a credit scorecard, which provides a scientific basis for risk control and credit rating classification. Evaluated in terms of the ROC curve, AUC value, KS value, and other indexes, the personal credit scoring model constructed in this thesis achieves 0.86, 0.583, and 91.29% on the key indexes AUC, KS, and accuracy, respectively. Moreover, its overall indicators are better than those of the traditional Stacking model and the single models, which shows that the model constructed in this thesis performs better in credit scoring.
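The abstract relies on two credit-scoring quantities, the IV (information value) used for feature screening and the KS statistic used for evaluation. The following is a minimal sketch of their commonly used textbook definitions, not the thesis's own implementation. The function names `information_value` and `ks_statistic` are hypothetical, the feature is assumed to be binned already, and every bin is assumed to contain at least one default and one non-default sample.

```python
import math


def information_value(good_counts, bad_counts):
    """IV of one binned feature, given per-bin counts of good and bad samples.

    Assumption: no bin has a zero count; otherwise the logarithm is
    undefined and some smoothing would be required.
    """
    total_good = sum(good_counts)
    total_bad = sum(bad_counts)
    iv = 0.0
    for good, bad in zip(good_counts, bad_counts):
        p_good = good / total_good  # share of all good samples in this bin
        p_bad = bad / total_bad     # share of all bad samples in this bin
        woe = math.log(p_good / p_bad)  # weight of evidence of the bin
        iv += (p_good - p_bad) * woe
    return iv


def ks_statistic(scores, labels):
    """Largest gap between cumulative bad and good rates.

    scores: predicted default probabilities; labels: 1 = default, 0 = non-default.
    Assumption: both classes are present in labels.
    """
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    cum_bad = cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        # Samples with the same score are added together, so ties
        # cannot create an artificial gap between the two curves.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks
```

A feature whose bins all have the same good-to-bad ratio gets an IV of 0, and a model that ranks every default above every non-default gets a KS of 1. How the thesis combines IV with feature importance into a composite score is not specified in the abstract, so it is not reproduced here.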
Keywords/Search Tags: Financial Risk Control, Credit Score, Ensemble Learning, Stacking