Analysis Of Personal Credit Evaluation Of Network Credit

Posted on: 2021-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: H Q Wei
GTID: 2439330602483509
Subject: Financial

Abstract/Summary:
Driven by Internet finance, the network credit industry is booming. By integrating traditional finance with the Internet, it attracts large numbers of investors and borrowers through high investment yields, a rapid loan process and convenient operation. Despite this rapid development, many problems remain to be solved. Platform operators absconding, withdrawal difficulties, closures, business transformations and bankruptcies have extremely adverse effects on lenders and borrowers as well as on the network credit market as a whole. The industry also faces many risks and challenges, such as credit risk, compliance risk and technical risk, among which the frequent occurrence of credit risk is particularly prominent. Establishing a more accurate model to assess borrowers' credit default risk through big data technology is therefore of great significance for protecting the interests of lenders, keeping platforms operating safely and supporting the healthy development of the industry.

Credit risk assessment for network credit involves a large amount of data, and the feature variables are complex and mostly non-linear. In different data environments, a single algorithm is affected by the particular characteristics of the data, while different algorithms analyze data from different aspects. The classification performance of a model can therefore be greatly improved by integrating multiple algorithms and exploiting their complementary strengths. Among existing ensemble algorithms, LightGBM is a classical Boosting algorithm with high classification accuracy and fast running speed, and it achieves better predictions by combining weak classifiers. In addition, the RB-SMOTE method not only helps prevent over-fitting during oversampling, but also effectively handles fuzzy class boundaries and the thorny problem of unequal distribution in small samples. This paper therefore proposes a LightGBM algorithm based on RB-SMOTE, which can greatly improve the classification performance of loan default prediction.

The data used for the empirical analysis are 2019 transaction data from the Lending Club platform, with a total of 518,107 observations. First, exploratory data analysis is carried out on users' basic personal information and the relevant loan information, and the main factors affecting credit default risk are analyzed qualitatively. Then a credit risk identification model is established using several machine learning methods. Because the class distribution of the data is seriously unbalanced, the RB-SMOTE algorithm is used to balance it. Three machine learning models, namely Random Forest, AdaBoost and LightGBM, are then used to assess the credit risk of borrowers. A comparison of the three models shows that the LightGBM model based on RB-SMOTE has the highest prediction accuracy and the lowest time cost, with prominent advantages in processing massive data. Next, to show that the proposed model generalizes well and can be applied to domestic network credit platforms, domestic loan data are used as well, and Chinese and American online lending are compared in four aspects: supervision mode, fund custody mode, risk control management system, and degree of data sharing. A risk identification model built on the domestic data shows that the LightGBM model based on RB-SMOTE again outperforms the other models, which verifies the conclusion of this paper and supplements the research above.

Constructing the LightGBM model based on RB-SMOTE and applying it to credit risk assessment in the network credit industry leads to the following conclusions. First, in comparison with the Random Forest and AdaBoost models, the LightGBM model based on RB-SMOTE has a significant optimization effect and is applicable to both Chinese and American network credit platforms. Second, a detailed analysis of the model's results shows that the LightGBM algorithm based on RB-SMOTE significantly improves the AUC, F1 and K-S values of the classifier. Taking the Lending Club platform as an example, the model improves the AUC from 0.814 to 0.954, the F1 score from 0.709 to 0.822, and the K-S value from 0.755 to 0.841. Third, the feature screening method differs from the traditional one, which is instructive. When screening features for dimensionality reduction, instead of directly selecting the important variables identified by the traditional risk control approach, this paper uses the LightGBM algorithm to screen variables of relatively high importance. The method is fast and easy to interpret, and it meets the requirements of big data risk control. Fourth, data disclosure by China's network credit platforms is insufficient. Comparing the empirical results based on domestic and foreign data shows that, when data from domestic platforms are used for modeling, risk identification performs far worse than with foreign data, mainly because of the large difference in the level of data disclosure between the network credit platforms of the two countries.

Finally, from the perspective of data risk control, this paper makes four suggestions for the development and supervision of the domestic network credit industry. First, implement a data sharing mechanism to create win-win cooperation. Second, open up data and pool ideas to build a more accurate and effective risk control system. Third, expand the dimensions of the sample data and build user portraits to provide accurate services. Fourth, improve the relevant laws and regulations on data disclosure to curb information leakage.
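
The abstract reports classifier quality as AUC, F1 and K-S values. As a rough illustration of how these three metrics are commonly computed from true default labels and predicted default scores, here is a minimal sketch in plain Python. It is not the thesis's code: the function names, the toy labels and scores, and the 0.5 decision threshold used for F1 are all assumptions made for this example.

```python
def auc(y_true, scores):
    """Rank-based AUC (Mann-Whitney U); tied scores share an average rank."""
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def f1_score(y_true, scores, threshold=0.5):
    """F1 of the positive (default) class; the threshold is an assumption."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ks_statistic(y_true, scores):
    """Largest gap between the cumulative score distributions of the two classes."""
    pairs = sorted(zip(scores, y_true))
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every observation that shares the current score.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


# Toy data (hypothetical): 1 = default, 0 = repaid; scores are predicted default probabilities.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.9]

print("AUC:", round(auc(y_true, scores), 3))
print("F1 :", round(f1_score(y_true, scores), 3))
print("K-S:", round(ks_statistic(y_true, scores), 3))
```

AUC and K-S depend only on how the scores rank borrowers, while F1 depends on the chosen threshold, so comparisons between models are only meaningful when all three are computed on the same held-out data.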
Keywords/Search Tags: Network Credit, Personal Credit Evaluation, Data Imbalance, LightGBM Algorithm