
Research On The Application Of Boosting Algorithm Based On Improved SMOTE In Personal Credit Evaluation

Posted on: 2021-03-25  Degree: Master  Type: Thesis
Country: China  Candidate: X Du
GTID: 2438330626954361  Subject: Applied statistics
Abstract/Summary:
As credit business in China expands, financial institutions and local economies have flourished, but credit risk has grown along with them. To avoid risk effectively and reduce the bank losses caused by misjudging customers, better methods for evaluating personal credit are needed. This paper uses the 2018 loan data from Lending Club as the original data set and, drawing on domestic and foreign literature and on the criteria for constructing credit evaluation systems, establishes an indicator system of 50 variables. The research covers two main aspects.

First, to address feature selection in the indicator system, we combine the PCA method with the ReliefF method to reduce the dimension of the variables. This combination not only removes redundant information but also takes into account each feature's ability to discriminate between classes, which effectively improves classification accuracy. The PCA-ReliefF method reduces the 50 variables to 20 principal components, which lowers the complexity of the model.

Second, to address the class imbalance in credit evaluation data, we improve the SMOTE algorithm and propose a new oversampling algorithm, called MS-SMOTE in this paper. First, a kernel distance is used to make the linear interpolation more reasonable. Then, according to the distribution of the minority class, different interpolation rules are applied to synthesize new minority samples, which changes the degree of imbalance in the data set and effectively improves the classification accuracy of the minority class.

XGBoost, LightGBM, and CatBoost are used to verify the advantages and effectiveness of the MS-SMOTE algorithm. The results show that the algorithm not only improves the classification accuracy of the minority class but also effectively reduces overfitting of the model, which reflects its feasibility and its value for wider adoption. We also verified the advantages of PCA-ReliefF in classification; the results show that it improves the classification performance of the model, which reflects its value in classification problems.
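
For readers unfamiliar with the linear interpolation that MS-SMOTE modifies, the following is a minimal sketch of the standard SMOTE step only. It is not the MS-SMOTE algorithm: the abstract does not give the kernel distance or the distribution-dependent interpolation rules, so neither is reproduced here. The function name `smote_sketch` and its parameters are hypothetical, and plain Euclidean distance is assumed for the neighbour search.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Baseline SMOTE interpolation (hypothetical helper, not MS-SMOTE).

    X_min : array of minority-class samples, shape (n, d), n >= 2
    n_new : number of synthetic samples to generate
    k     : number of nearest minority neighbours considered per sample
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise Euclidean distances between minority samples (assumption:
    # the thesis replaces this with a kernel distance, not shown here).
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest minority neighbours of every sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbours for each new point.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate: x_new = x_base + lam * (x_neighbour - x_base), lam in [0, 1).
    lam = rng.random((n_new, 1))
    return X_min[base] + lam * (X_min[chosen] - X_min[base])
```

Each synthetic point lies on the segment between a minority sample and one of its nearest minority neighbours, so the new points stay inside the region already occupied by the minority class. In practice the synthetic samples would be appended to the training set only, before fitting a classifier such as the boosting models named above.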
Keywords/Search Tags: Credit evaluation, PCA-ReliefF, Imbalanced data, SMOTE algorithm, Ensemble learning