| With the development of big data, a new financial model, "Internet + Finance", has gradually emerged, but its risk quantification capability remains insufficient. The purpose of risk quantification is to identify, as early as possible, customers who may cause economic losses to financial enterprises, so that the enterprises can act on the early warning and take collection measures to minimize their losses. According to my internship experience in a bank’s technology credit department during my master’s degree in engineering, the bank’s current credit evaluation policy is mainly based on subjective scoring: formulas are used to calculate the expected loss rate and, from it, the risk cost rate. This scoring method is easily influenced by the subjective will of the approver. Today’s personal credit risk assessment methods are mainly based on data mining, and the mainstream approach is to use a stacked ensemble model. Since a stacked ensemble model is built on multiple classification models, its training time is the sum of the training times of those models; if the feature dimension of the dataset is too high, training becomes very long. Moreover, most current research directly specifies the base learners of the fusion model and does not consider the differences between combinations of base learners, which limits the performance of the stacked ensemble model. In addition, personal credit datasets are class-imbalanced, which reduces the effectiveness of classification methods. Therefore, to address these shortcomings of existing research, this thesis designs a stacked ensemble prediction model optimized by the ant colony algorithm. The specific research work of this thesis is as follows: 1. A preprocessing method for personal credit datasets is proposed. XGBoost is used to select some city features for binary derivation, which avoids the large number of sparse features produced by one-hot encoding; SMOTE oversampling is used to generate minority-class samples so that the class distribution of the dataset becomes more balanced; and LightGBM is used to perform feature selection on the credit dataset, reducing the computation and time spent in model training. 2. A stacking prediction algorithm optimized by the ant colony algorithm is proposed. Using an ant colony algorithm based on the ant-cycle model, the best combination of base learners in the stacking ensemble model is searched for and then combined with the meta-learner for personal credit default prediction; experiments demonstrate the effectiveness of the model. Compared with other representative OICSM models, the proposed model improves the AUC by 5.8% on the dataset, and parameter analysis experiments verify the validity and rationality of the model at different levels. 3. PyQt5, QSS, and an SQLite database are used to implement a personal credit management system comprising a role management module, user login module, data analysis module, customer information management module, loan business management module, post-loan management module, query management module, and a visualization module comparing credit default prediction results. The post-loan management module predicts credit default customers in real time, providing an application scenario and case for the proposed model. In summary, building on existing research, this thesis balances the class distribution and reduces the feature dimension of the personal credit dataset, and designs a stacked ensemble prediction model optimized by the ant colony algorithm based on the idea of the ant-cycle model. Experiments show that the proposed model can effectively predict personal credit default, providing a model and experimental reference for the credit default model of the company where the internship took place. |
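
The search for a good combination of base learners described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the candidate names in `CANDIDATES`, the scoring function `toy_score` (a stand-in for the cross-validated AUC of a stack), and all parameter values are assumptions made for illustration. The pheromone update deposits once per completed tour, in the spirit of the ant-cycle rule, adapted here to a maximization problem.

```python
import random

# Hypothetical candidate base learners; the abstract does not list them.
CANDIDATES = ["lr", "rf", "xgb", "lgbm", "knn", "svm"]


def toy_score(subset):
    """Stand-in for the validation AUC of a stack built on `subset`.

    Assumption for illustration only: a real run would train the stacked
    model on these base learners and return its cross-validated AUC.
    """
    gains = {"lr": 0.02, "rf": 0.05, "xgb": 0.07,
             "lgbm": 0.06, "knn": 0.01, "svm": 0.01}
    if not subset:
        return 0.0
    # Each extra learner carries a small cost, so bigger is not always better.
    return 0.70 + sum(gains[name] for name in subset) - 0.015 * len(subset)


def aco_select(candidates, score, n_ants=10, n_iters=30, rho=0.2, q=1.0, seed=0):
    """Search base-learner subsets with an ant-cycle style pheromone update."""
    rng = random.Random(seed)
    # Two pheromone trails per learner: index 0 = exclude, index 1 = include.
    tau = {name: [1.0, 1.0] for name in candidates}
    best_subset, best_score = (), float("-inf")

    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            # Each ant decides learner by learner, guided by pheromone.
            subset = []
            for name in candidates:
                p_include = tau[name][1] / (tau[name][0] + tau[name][1])
                if rng.random() < p_include:
                    subset.append(name)
            s = score(subset)
            tours.append((tuple(subset), s))
            if s > best_score:
                best_subset, best_score = tuple(subset), s

        # Evaporate all trails, then deposit once per completed tour.
        for name in candidates:
            tau[name][0] *= 1.0 - rho
            tau[name][1] *= 1.0 - rho
        for subset, s in tours:
            for name in candidates:
                tau[name][1 if name in subset else 0] += q * s

    return best_subset, best_score


if __name__ == "__main__":
    best, auc = aco_select(CANDIDATES, toy_score)
    print("selected base learners:", best, "toy score:", round(auc, 4))
```

In a real pipeline, `toy_score` would be replaced by a function that trains the stacked model on the chosen base learners and returns its validation AUC. That evaluation dominates the running time, so the number of ants and iterations would need to be kept small.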