
Study On The Risk Prediction Model Of User Loan Based On Machine Learning

Posted on: 2021-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: C N Zhang
GTID: 2428330611956470
Subject: Software engineering
Abstract/Summary:
With the rise of lending platforms and the growth of Internet finance, predicting user loan risk has become increasingly important to the financial industry. As data volumes have risen sharply, audit costs in the traditional financial industry have increased and efficient data processing has become difficult. Advances in computing and the emergence of machine learning in the big-data era offer new possibilities. With large numbers of loan users, online lending platforms also face a series of risk-management problems that harm the legitimate rights and interests of both platforms and users. Regulators have therefore issued policies that restrict and manage lending platforms in order to promote their development. Lending platforms, for their part, should also use technical means to avoid risk: a machine-learning prediction model can extract useful information and predict risk, helping to control risk and minimize losses. This thesis discusses the application of machine learning in Internet finance, focusing on the problem of user loan risk prediction. Building on previous research, it uses a desensitized user-loan data set provided by an Internet lending platform. The main contents are as follows:

(1) Data preprocessing. Users' basic personal information and the corresponding loan-related data are explored and analyzed. The data set is cleaned, including the handling of missing and duplicate values, and time stamps are supplemented.

(2) Feature engineering. New features are derived by feature crossing, some features are one-hot encoded, and some variables are normalized. For feature selection, a random forest algorithm ranks the features by importance and the top 15 are kept, which completes the set of variables fed to the final model.

(3) Model building and optimization. A validation set and a test set are split off from the training data. The data set produced by feature selection is used as input to an XGBoost model, and the optimal parameters of the final model are obtained through parameter tuning and cross-validation. The model is then applied to the test set and the performance of the fusion model is evaluated. Compared with a logistic regression model and a GBDT model, the improved model predicts better than both.

(4) Through these experiments, the thesis proposes an XGBoost model based on random forest for predicting user loan risk, and the model achieves better prediction accuracy. Finally, based on the model results and the big-data context, it offers suggestions for identifying high-risk users on online lending platforms.
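The selection-then-boosting pipeline described in steps (2) and (3) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the data is synthetic (the original loan data set is not available), scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, and the parameter grid, the 3-fold cross-validation, and the AUC metric are all illustrative choices that the abstract does not specify.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the desensitized loan data (hypothetical: 40 features).
X, y = make_classification(n_samples=600, n_features=40, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Rank features by random-forest importance and keep the top 15.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
top15 = np.argsort(forest.feature_importances_)[::-1][:15]

# Tune a gradient-boosted model on the selected features with 3-fold CV.
# GradientBoostingClassifier is a stand-in (assumption); with the xgboost
# package installed, xgboost.XGBClassifier could be used instead.
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}
search = GridSearchCV(GradientBoostingClassifier(random_state=0),
                      param_grid, scoring="roc_auc", cv=3)
search.fit(X_train[:, top15], y_train)

# Evaluate the tuned model once on the held-out test set.
test_scores = search.predict_proba(X_test[:, top15])[:, 1]
auc = roc_auc_score(y_test, test_scores)
print("selected features:", sorted(top15.tolist()))
print("best params:", search.best_params_)
print("test AUC: %.3f" % auc)
```

The feature ranking is fitted on the training split only, so the test set plays no part in either feature selection or parameter tuning.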
Keywords/Search Tags: Risk Prediction Model, Feature Engineering, Random Forest, XGBoost Algorithm, GBDT Algorithm, Logistic Regression Algorithm