Risk Evaluation Of Mutual Fund Loan Based On CatBoost Algorithm

Posted on: 2024-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: G P Liu
GTID: 2568307070451844
Subject: Electronic information
Abstract/Summary:
With the continued development and spread of Internet technology, Internet finance has become a new form of finance. By combining Internet technology with financial innovation, it breaks down the barriers of the traditional financial industry and gives users more convenient, lower-cost financial services. It not only meets consumers' personalized financial needs but also opens more financing channels for small, medium, and micro enterprises, promoting economic growth. Internet finance has developed in China for more than ten years and has gradually moved from an early phase of unregulated, rapid expansion to a period of stable development. Major Internet finance platforms are paying more and more attention to risk control, and accurately assessing customer risk has become an urgent problem that bears on the healthy and sustainable development of platform companies. In the past, platform companies mostly relied on scorecards and expert models, which leave considerable room for improvement in accuracy and efficiency. With the development of big data and artificial intelligence, some Internet finance companies have gradually adopted machine learning algorithms. Current research, however, mostly focuses on detailed improvements to a single algorithm and pays little attention to data quality analysis and feature engineering for Internet finance platforms. In view of this, this thesis addresses the pre-loan customer classification scenario: using Lending Club's historical credit data, it improves the CatBoost model with the Stacking algorithm to build a fusion model for pre-loan prediction. The main work is as follows:

(1) Data processing and exploratory analysis of customer data based on feature engineering. Before constructing a model for empirical analysis, we first conducted data mining and descriptive analysis on the Lending Club dataset, sorted out the features along each dimension of the dataset, clarified the distribution of the target variable, and studied important variables such as loan amount and interest rate. The raw data are then preprocessed: missing values and outliers are handled, irrelevant variables are removed, discrete features are one-hot encoded, the SMOTE algorithm is used to address class imbalance, a correlation matrix is used for feature screening, and feature derivation is used to expand the business variables, which completes the feature engineering.

(2) Design of the XG-CatBoost fusion model. This thesis first selects the ensemble algorithms, then applies the idea of Stacking to improve on CatBoost, designing a structure with CatBoost and XGBoost as the base models. To support interpretability and prevent overfitting, logistic regression is used as the second-level meta-model. Finally, to reduce the impact of class imbalance, the loss function is designed with weighted cross-entropy, which completes the XG-CatBoost fusion model.

(3) Construction, training, and experimental analysis of the pre-loan prediction model. On the basis of the feature engineering, the data are divided into a training set and a test set, and 5-fold cross-validation is used to carry out an empirical analysis of CatBoost, logistic regression, random forest, GBDT, XGBoost, and the proposed XG-CatBoost model, with accuracy, AUC, and the KS value as evaluation metrics. The experimental results show that, for the pre-loan prediction scenario, CatBoost classifies customers more accurately than the other single models, and the proposed XG-CatBoost fusion model outperforms the original model in accuracy, AUC, and KS.
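
Two quantities named in the abstract, the weighted cross-entropy loss and the KS value, can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact formulas used in the thesis, so the definitions below are the common textbook forms and should be read as assumptions: the function names, the class weights `w_pos` and `w_neg`, and the clipping constant `eps` are all hypothetical and are not taken from the thesis.

```python
import math


def weighted_cross_entropy(y_true, p_pred, w_pos=1.0, w_neg=1.0, eps=1e-12):
    """Mean weighted binary cross-entropy.

    w_pos and w_neg are hypothetical class weights (assumption); raising
    w_pos penalizes errors on the positive class more heavily.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def ks_statistic(y_true, p_pred):
    """KS value: largest gap between the cumulative share of positives and
    the cumulative share of negatives as the score threshold is lowered."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Sort by predicted score, highest first.
    pairs = sorted(zip(p_pred, y_true), key=lambda t: -t[0])
    pos_seen = neg_seen = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Consume all samples tied at this score before measuring the gap.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                pos_seen += 1
            else:
                neg_seen += 1
            i += 1
        ks = max(ks, abs(pos_seen / n_pos - neg_seen / n_neg))
    return ks


if __name__ == "__main__":
    labels = [1, 0, 1, 0]          # toy labels, not thesis data
    scores = [0.9, 0.8, 0.7, 0.1]  # toy predicted probabilities
    print(weighted_cross_entropy(labels, scores, w_pos=2.0))
    print(ks_statistic(labels, scores))
```

In this sketch, setting `w_pos` above `w_neg` makes a misclassified positive sample cost more than a misclassified negative one, which is one common way to counter class imbalance; which class the thesis up-weights, and by how much, is not stated in the abstract.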
Keywords/Search Tags: Pre-loan forecast, Feature engineering, CatBoost, Fusion model