| In the information age, people have access to ever more resources, and the means of reaching them are becoming faster and more convenient. Customer churn occurs in every industry, including banking. With the rapid growth of the Internet and the saturation of the credit card market, banks' credit card business has been significantly affected. Retaining existing customers, predicting and preventing customer churn, and improving customer trustworthiness have become challenges for banks. To address customer churn, this thesis adopts the LightGBM model and a modified LightGBM model. Specifically, based on credit card data from Kaggle, 10 variables were selected after feature cleaning, feature encoding and feature engineering. The thesis builds a LightGBM model for credit card customer churn and compares it with random forest, decision tree and logistic regression models. The comparison indicates that LightGBM performs best among the models considered. To address class imbalance at the data level, four methods are considered: random undersampling, random oversampling, SMOTE and SMOTEENN. Among them, oversampling improves the model's predictive performance the most. To address class imbalance at the algorithm level, Focal Loss is used in place of cross-entropy (CE) loss. By tuning the parameters of Focal Loss, the model balances its learning between positive and negative samples and between hard and easy samples. Accuracy, recall, F2-score and AUC are used to evaluate the models, and the FL-LightGBM model is found to identify churning customers particularly well. Finally, the FL-LightGBM model outputs feature importances and churn probabilities. Based on the predicted probability, customers are divided into five risk levels, and corresponding suggestions are given for customers at each level. |
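
To make the algorithm-level idea concrete, the following is a minimal sketch of how Focal Loss differs from CE loss for a single prediction. It assumes the standard binary form, FL = -alpha_t (1 - p_t)^gamma log(p_t); the abstract does not state the exact variant or parameter values used, so the function names and the defaults `alpha=0.25` and `gamma=2.0` are illustrative assumptions rather than the thesis's settings.

```python
import numpy as np

def cross_entropy(p, y):
    """Per-sample binary cross-entropy for probabilities p and labels y in {0, 1}."""
    p_t = np.where(y == 1, p, 1.0 - p)  # probability assigned to the true class
    return -np.log(p_t)

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Per-sample binary focal loss; alpha and gamma are illustrative, not the thesis's values."""
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) samples
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# Three customers who actually churned, from easy (0.9) to hard (0.1) to classify.
p = np.array([0.9, 0.6, 0.1])
y = np.array([1, 1, 1])

ce = cross_entropy(p, y)
fl = focal_loss(p, y)
ratio = fl / ce  # equals alpha_t * (1 - p_t)^gamma for each sample
print(ratio)
```

The ratio shows that the easy sample's loss is scaled down far more than the hard sample's, which is the mechanism by which tuning `gamma` shifts learning toward difficult samples and tuning `alpha` rebalances the two classes. Using this loss inside a gradient-boosting library would additionally require its gradient and Hessian with respect to the raw score, which are omitted here.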