
The Application Of Logistic And Its Improvement Methods In Personal Internet Credit Scoring

Posted on: 2020-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Hu
GTID: 2439330578453164
Subject: Finance

Abstract/Summary:
In recent years, with the rapid development of China's economy and changing consumer attitudes, credit consumption has gradually taken a central position in consumption activity. More and more people apply to commercial banks or other financial institutions for consumer credit loans. Applicants care whether their applications will be approved, while lenders care whether applicants will repay on time. By using a credit scoring model as a personal credit assessment tool, commercial banks or financial institutions can predict whether an applicant is likely to default or to repay, and then decide whether to grant credit. Accurately identifying potential defaulters and minimizing the losses caused by customer default risk is the core problem these institutions have to solve. It is therefore of great significance to establish and improve an effective personal credit scoring model.

This thesis uses personal online credit data. During pre-processing, missing values are filled using a single imputation method and the mode method, and the imbalanced data are balanced by re-sampling. On this basis, Logistic regression, Lasso-Logistic regression, and Lasso-Logistic regression based on Shapley Value are used to build personal credit scoring models. For the empirical analysis, the sample data are divided into a training set and a test set, and the predictive performance of the three methods in credit scoring is analyzed and compared. The three methods are also compared before and after adding the existing credit rating indicator of an online credit platform.

The empirical results show that Logistic regression and its improved methods all have good robustness and predictive ability. In terms of prediction accuracy, Lasso-Logistic regression and Lasso-Logistic regression based on Shapley Value are significantly more accurate than Logistic regression. The reasons are as follows. Lasso-Logistic regression adds a penalty term to logistic regression, which eliminates relatively unimportant variables and reduces model complexity; it has the highest prediction accuracy of the three. Lasso-Logistic regression based on Shapley Value uses the relative contribution of each coefficient as its weight, which allows the model to adjust the regression coefficients, and its prediction accuracy is higher than that of Logistic regression.

In addition, comparing the predictions before and after adding the platform's existing credit rating indicator shows that the overall prediction accuracy of Logistic regression, Lasso-Logistic regression, and Shapley Value Lasso-Logistic regression is significantly higher with the platform credit rating than without it. These results provide a basis for introducing the Lasso and Shapley Value methods into personal credit risk assessment models, and for constructing appropriate and effective risk control methods based on user-related data and anti-fraud identification.
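
The contrast between plain Logistic regression and Lasso-Logistic regression described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the data are synthetic, the function names (`fit_logistic`, `soft_threshold`, `make_data`), the penalty strength `lam=0.05`, and the proximal gradient solver are all assumptions chosen for illustration, and the Shapley Value weighting step is not reproduced here. The sketch only shows how an L1 penalty term can shrink the coefficients of uninformative variables to exactly zero.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrink toward zero, clip at zero.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_logistic(X, y, lam=0.0, lr=0.5, n_iter=300):
    """Proximal gradient descent on mean log-loss + lam * ||w||_1.

    lam = 0 gives plain logistic regression; lam > 0 gives the Lasso variant.
    The intercept b is not penalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def make_data(n=300, seed=0):
    # Hypothetical synthetic "applicants": only the first two of six
    # features actually influence the default probability.
    rng = random.Random(seed)
    true_w = [2.0, -1.5, 0.0, 0.0, 0.0, 0.0]
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0.0, 1.0) for _ in true_w]
        p_default = sigmoid(sum(w * x for w, x in zip(true_w, xi)))
        X.append(xi)
        y.append(1 if rng.random() < p_default else 0)
    return X, y


X, y = make_data()
w_plain, _ = fit_logistic(X, y, lam=0.0)   # plain Logistic regression
w_lasso, _ = fit_logistic(X, y, lam=0.05)  # Lasso-Logistic regression
print("plain:", [round(v, 3) for v in w_plain])
print("lasso:", [round(v, 3) for v in w_lasso])
print("nonzero (lasso):", sum(1 for v in w_lasso if v != 0.0))
```

In practice the penalty strength would be chosen by cross-validation on the training set rather than fixed in advance, and a library solver would normally replace the hand-written loop.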
Keywords/Search Tags: Personal Credit Scoring, Logistic Regression, Lasso Method, Shapley Value, Shapley Value Lasso-Logistic Regression, Prediction Accuracy