
Research And Application Of FL-lightgbm Algorithm Based On Unbalanced Data

Posted on: 2021-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2439330626454370
Subject: Applied statistics
Abstract/Summary:
As attitudes toward consumption change, the idea of "advance consumption" has been accepted by more and more people, and consumer finance has grown rapidly. It lets users buy goods before they can afford to pay for them, speeds up the circulation of goods, and promotes economic development to a certain extent. Consumer finance loans are small, unsecured, and require no guarantee. These characteristics benefit more low-income people, but they also expose consumer finance companies to the risk of loan default. This thesis uses machine learning methods to predict the risk of user default and so reduce the bad debt rate of consumer finance companies.

Traditional machine learning methods usually assume that the classes in the data are roughly balanced, but consumer finance loan data is unbalanced: the number of non-defaulting users is far greater than the number of defaulting users. In this case, a traditional algorithm leads the model to pay too much attention to the non-defaulting samples and to misclassify the small number of defaulting users, which is very costly for consumer finance companies. It is therefore of great significance to study the classification of unbalanced default data in consumer finance loans.

Based on the Home Credit loan default data, this thesis predicts whether a user will default. First, the data is preprocessed and analyzed, new features are constructed according to the characteristics of the data, and the top 150 features according to an XGBoost model are selected for modeling. Second, XGBoost and LightGBM models are each built and evaluated with AUC as the metric; LightGBM performs better than XGBoost overall. Finally, the loss function of LightGBM is improved by using focal loss as the loss function of the model. The results show that the LightGBM model with the improved loss function predicts the small number of default samples better. Its AUC reaches 0.757144, with a running time of only 43 s.
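The focal loss idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation or hyperparameters used in the thesis, so everything below is an assumption: the binary focal loss FL = -alpha_t * (1 - p_t)^gamma * log(p_t) applied to a raw score z with p = sigmoid(z), the defaults alpha = 0.25 and gamma = 2.0 (commonly used values, not taken from the thesis), and the hypothetical function names `focal_loss`, `focal_grad`, and `focal_hess`. The second derivative is approximated by a finite difference of the analytic gradient rather than derived in closed form.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _terms(z, y, alpha):
    # p_t is the predicted probability of the true class;
    # a_t is the class weight (alpha for positives, 1 - alpha for negatives).
    p = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return p_t, a_t


def focal_loss(z, y, alpha=0.25, gamma=2.0):
    # Focal loss for one sample with raw score z and label y in {0, 1}.
    p_t, a_t = _terms(z, y, alpha)
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)


def focal_grad(z, y, alpha=0.25, gamma=2.0):
    # Analytic first derivative of the focal loss with respect to z.
    p_t, a_t = _terms(z, y, alpha)
    sign = 1.0 if y == 1 else -1.0
    return sign * a_t * (
        gamma * p_t * (1.0 - p_t) ** gamma * math.log(p_t)
        - (1.0 - p_t) ** (gamma + 1.0)
    )


def focal_hess(z, y, alpha=0.25, gamma=2.0, h=1e-4):
    # Second derivative, approximated by a central difference of the gradient.
    return (focal_grad(z + h, y, alpha, gamma)
            - focal_grad(z - h, y, alpha, gamma)) / (2.0 * h)


if __name__ == "__main__":
    # A confidently correct positive contributes far less loss than a missed one.
    print(focal_loss(3.0, 1), focal_loss(-3.0, 1))
```

With gamma = 0 and alpha = 0.5 this reduces to half the usual cross-entropy, which is a convenient sanity check. LightGBM supports custom objectives that return per-sample gradients and Hessians with respect to the raw score, so functions like these could be vectorized and wrapped in such a callable; how the objective is passed differs between LightGBM versions, so the wiring is left out here.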
Keywords/Search Tags: Consumer finance, unbalanced data, loss function, lightgbm, focal loss