Credit Risk Assessment Based On Hybrid Ensemble Algorithm

Posted on: 2021-03-31  Degree: Master  Type: Thesis
Country: China  Candidate: J L Zhao
GTID: 2428330605959042  Subject: Computer technology
Abstract/Summary:
Credit loans are among the most widely used loan products. Their scale reflects the financial development level of lending institutions, and they offer customers considerable convenience in production and daily life. Customer credit quality has become the core criterion by which financial institutions decide whether to issue a loan, so building an efficient and accurate model to estimate the probability of customer default is a pressing problem. This thesis improves the construction of personal credit evaluation models in two respects: handling imbalanced credit data and building ensemble models. The algorithms and models are validated on a public UCI credit data set, with the aim of helping financial institutions strengthen risk prevention and control.

First, regarding data processing: the customer data collected by financial institutions is diverse and class-imbalanced. Building on the traditional SMOTE method for imbalanced data, this thesis proposes an improved threshold-based synthetic minority oversampling algorithm (Ts-SMOTE), which synthesizes new samples from near-neighbor samples. In the experiments, the method is used to build a single XGBoost prediction model. The results show that Ts-SMOTE achieves a higher G-mean and F-value than traditional SMOTE, which supports its effectiveness on imbalanced data.

Second, regarding model construction: XGBoost is chosen as the base model. Previous research indicates that ensemble models offer higher classification accuracy and stability than single models. To address the diversity of the base models, and working on the balanced data, this thesis designs BXgboost, a Bagging ensemble model based on feature selection: the features of the data are grouped, and an XGBoost base model is built on each group of features. Parameter perturbation further increases the diversity among the base models. Experimental results show that BXgboost outperforms the single XGBoost model on all evaluated metrics.

Finally, building on BXgboost, this thesis proposes a hybrid ensemble model, PBX. The data set is partitioned for model training, and particle swarm optimization (PSO) is then used to optimize the multiple base models derived from BXgboost; the final result is obtained by multiplying each base model's prediction by its corresponding weight. In the final experiments, comparison with other models shows that the proposed PBX model performs better, and its highest accuracy reaches 81%.
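
The weighted combination of base-model predictions and the G-mean metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `weighted_fusion` and `g_mean`, the toy scores, the fixed weights (which the thesis obtains through PSO), the normalisation of weights to sum to 1, and the 0.5 decision threshold are all assumptions made here for illustration.

```python
import math


def weighted_fusion(model_scores, weights):
    """Combine per-model default scores with a weighted sum.

    model_scores: one list per base model, each holding one score per sample.
    weights: one non-negative weight per base model. They are normalised
             here to sum to 1 (an assumption, not stated in the abstract).
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normalised = [w / total for w in weights]
    n_samples = len(model_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(normalised, model_scores))
        for i in range(n_samples)
    ]


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data: three hypothetical base models scoring four customers.
model_scores = [
    [0.9, 0.2, 0.6, 0.1],
    [0.7, 0.4, 0.3, 0.2],
    [0.8, 0.1, 0.7, 0.6],
]
weights = [0.5, 0.3, 0.2]   # fixed here; the thesis searches for these with PSO
y_true = [1, 0, 1, 0]       # 1 = default, 0 = no default

fused = weighted_fusion(model_scores, weights)
y_pred = [1 if score >= 0.5 else 0 for score in fused]  # assumed threshold
score = g_mean(y_true, y_pred)
```

In this toy case the third model alone would misclassify the fourth customer (0.6 is above the threshold), but the weighted combination pulls that score back below 0.5. G-mean is used alongside accuracy because, on imbalanced data, a model can reach high accuracy while missing most defaulters.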
Keywords/Search Tags:Credit Evaluation, Data Imbalance, Xgboost, Hybrid ensemble