Comparative Study Of P2P Network Loan Default Forecasting Model Based On LightGBM And XGBoost Algorithm

Posted on: 2018-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: J L Sha
GTID: 2348330542988244
Subject: Statistics
Abstract/Summary:
In the 21st century, big data and Internet finance have made tremendous progress. P2P lending, an important part of Internet finance, takes advantage of Internet technology and is more convenient, faster and more transparent than traditional methods. With rapidly growing demand for microfinance, P2P lending is a powerful supplement to, and improvement on, the traditional financial industry. P2P is the abbreviation of peer-to-peer, meaning individual to individual, and the business is also known as peer-to-peer lending. It is a form of microfinance that gathers idle funds from different investors and lends them to borrowers. As the P2P industry has developed rapidly, it has encountered many problems. Its fatal weakness is that its borrowers are those who do not meet bank lending standards, or small businesses. The Chinese P2P industry has seen large-scale loan defaults and platform closures, which caused huge losses for investors and seriously impeded the industry's development.

The core of P2P lending is risk control, and the default forecasting model is the core of risk control. For a P2P online lending company, assessing borrower risk truthfully and reliably and keeping the overall default rate low are not only a responsibility to investors but also the key to staying in business over the long term. It is therefore very important to study default forecasting models for P2P online lending platforms and to propose corresponding improvements that guide the sound development of the industry.

This paper mainly studies the default forecasting model of P2P online lending and its influencing factors. Because Lending Club is currently one of the largest P2P trading platforms in the United States and its data are open and transparent, this paper uses Lending Club's transaction data over the years as the research data. First, a descriptive statistical analysis of the data was made from five angles: loan scale, borrower grade, loan term, loan purpose, and default rate. Then the data were cleaned by "multi-dimensional" and "multi-observation" methods, yielding a multi-dimensional dataset with 480018 observations and 61 variables and a multi-observation dataset with 569338 observations and 24 variables. To verify the robustness of the model, each of the two datasets was split randomly into a training set and a test set. All the factors that affect borrowing were divided into four categories: loan details, economic status, credit status and personal information. Next, using Python and R respectively, the LightGBM and XGBoost algorithms were applied to the P2P online loan default prediction model. The two models were applied to both datasets, and the four results obtained were compared and analyzed in detail. Finally, the factors that influence default were ranked and analyzed.

The results showed that, for the same algorithm, whether LightGBM or XGBoost, the "multi-observation" dataset gave better classification performance than the "multi-dimensional" dataset, and that for the same dataset, LightGBM predicted the classification results better than XGBoost. The best classification result came from LightGBM on the "multi-observation" dataset, with a correct rate of 80.10%, which is 1.28 percentage points higher than the average performance rate of 78.82% in Lending Club's historical transaction data. A rough estimate suggests that if Lending Club had used the LightGBM algorithm since its inception, it would have reduced its defaulted loans by about $117 million. Using the LightGBM algorithm for classification forecasting is therefore meaningful and effective. In the ranking of influencing factors, the four categories in descending order of importance were: loan details > economic status > credit status > personal information.

In summary, this paper makes the following suggestions for foreign P2P lending platforms. On the one hand, the principle for setting interest rates should be adjusted: the interest rates of borrowers with lower credit standards should be reduced in order to reduce the default rate. On the other hand, multinational development is a long-term trend. For the development of P2P lending platforms in China, the paper makes the following suggestions. In terms of government policy, the credit reporting system should be perfected as soon as possible and a punitive mechanism for the P2P industry should be established. In terms of industry mechanisms, the industry should establish a mechanism for safeguarding the rights and interests of investors, improve the decentralized investment mechanism for investors, gradually innovate and transform, and diversify its service modes.

LightGBM and XGBoost are at the forefront of machine learning algorithms in recent years and have been widely praised by scholars in many fields. This paper applies both algorithms to the default forecasting model of the P2P industry. On the one hand, this increases the choice of default forecasting models; on the other hand, it expands the range of application of LightGBM and XGBoost. However, because research on these two algorithms is still in its infancy and few related academic documents are currently available for reference, the research inevitably has deficiencies. This paper is an attempt to apply the LightGBM and XGBoost machine learning algorithms to P2P default forecasting. The optimization of the two algorithms is limited to adjusting their parameters, and combining them with other algorithms or models has not been studied. As scholars study these two algorithms further, it is believed that more optimized algorithms will be used in default forecasting models.
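
The category ranking reported in the abstract (loan details > economic status > credit status > personal information) can be illustrated with a minimal sketch. It assumes that per-variable importance scores, such as the split counts a gradient boosting model can report, are summed within each of the four categories and that the categories are then sorted by total. The abstract does not state how the thesis computed its ranking, so this aggregation rule is an assumption, and the variable names, their category assignments and the scores below are hypothetical placeholders, not values from the thesis.

```python
from collections import defaultdict

# Hypothetical assignment of variables to the four categories named in the
# abstract. The variable names are illustrative, not taken from the thesis.
CATEGORY_OF = {
    "int_rate": "loan details",
    "term": "loan details",
    "annual_inc": "economic status",
    "dti": "economic status",
    "delinq_2yrs": "credit status",
    "emp_length": "personal information",
}

# Hypothetical per-variable importance scores (made-up numbers standing in
# for whatever importance measure a fitted model reports).
IMPORTANCE = {
    "int_rate": 30,
    "term": 15,
    "annual_inc": 20,
    "dti": 10,
    "delinq_2yrs": 15,
    "emp_length": 10,
}


def rank_categories(importance, category_of):
    """Sum importance per category and return (category, total) pairs,
    sorted from most to least important."""
    totals = defaultdict(int)
    for variable, score in importance.items():
        totals[category_of[variable]] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank_categories(IMPORTANCE, CATEGORY_OF)
for category, total in ranking:
    print(f"{category}: {total}")
```

Summing within a category favors categories that contain many variables; averaging per category would be an equally plausible choice and could change the order.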
Keywords/Search Tags: P2P Network Lending, Default Forecasting Model, LightGBM Algorithm, XGBoost Algorithm