| In recent years, as China’s socialist market economy has shifted from high-speed growth to high-quality development, society has placed growing demands on a high-quality credit economy and credit society. Personal credit transactions such as borrowing money, borrowing books, renting cars, and taking out mortgages are now part of daily life. In the financial field, individual borrowers account for an increasingly important share of the business and profits of China’s banks. If traditional credit scoring techniques are still used to measure an individual’s credit status, personal credit lines will be misallocated, so that a bank’s high-quality loans do not reach the optimal disbursement ratio. In some cases credit is misjudged and a user with poor credit is classified as a high-quality user, which exposes commercial banks and other lending institutions to substantial default risk and adds instability to financial and economic markets. At the same time, with the progress of science, technology, and education in China, artificial intelligence and machine learning have developed rapidly in recent years. A number of algorithms have been applied to personal credit scoring in the financial field, such as logistic regression, support vector machines (SVM), random forests, and deep learning. These methods can quickly and effectively improve the accuracy with which financial institutions identify user credit. Against this background, this article works at the intersection of two disciplines, economics and statistics, to study the problem of building personal credit scoring models in the financial field. This article applies statistical algorithm ideas to the field of personal credit 
scoring in an attempt to establish a reasonable and effective personal credit scoring model with high accuracy and stability, and thereby to provide commercial banks and other financial and credit institutions with a practical model that improves the accuracy of personal credit scores and reduces the risk of user default. In the introduction in Chapter 1, this article reviews the research of domestic and foreign scholars on personal credit scoring and identifies the main stages of personal credit scoring research: selection of personal credit features, pre-processing of the credit sample data set and handling of its imbalance, establishment of the personal credit scoring model, and evaluation of the model. Chapters 2 and 3 introduce the basic concepts of personal credit scoring, the basic concepts of ensemble learning, the types of ensemble learning algorithms, and some commonly used ensemble learning algorithms. Chapter 4 proposes an improved Random-SMOTE algorithm to deal with data set imbalance in the second stage of personal credit scoring. Chapter 5 proposes a personal credit scoring model with the XGBoost algorithm at its core and analyzes it experimentally on the German credit data set. Chapter 6 proposes a personal credit scoring model with the Stacking ensemble algorithm at its core and analyzes it empirically on the Lending Club credit data set. The article closes with a summary and an outlook on future work. In the experiments, the XGBoost-based and Stacking-based personal credit scoring models are evaluated on the German credit data set and the Lending Club credit data set. Compared with other commonly used algorithms such as SVM, Random Forest, and GBDT, the two ensemble learning models proposed in this paper have certain advantages in performance 
evaluation indicators such as accuracy, precision, recall, F1 score, and the ROC curve, especially in terms of prediction accuracy. Overall, the proposed scoring models perform better than these baselines. |
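
To make the evaluation indicators named above concrete, the following minimal sketch computes accuracy, precision, recall, and F1 score from binary labels. The function name `binary_metrics`, the convention that label 1 marks a defaulting user, and the toy labels are assumptions introduced here for illustration; they are not taken from the article's experiments or data sets.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    `positive` is the label treated as the positive class
    (assumed here to mean "user defaults").
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 3 defaulting users, 5 non-defaulting users.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced credit data, accuracy alone can look high even when few defaulting users are caught, which is one reason precision, recall, and F1 are reported alongside it.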