Anti-fraud Model Practice And Interpretability Research In The Field Of Internet Consumer Loan

Posted on: 2022-10-16 | Degree: Master | Type: Thesis
Country: China | Candidate: Y M Wang
GTID: 2518306725483754 | Subject: Applied Statistics
Abstract/Summary:
In the decentralized Internet era, the "Internet + Finance" model has broken into the traditional financial industry with its advantages of low cost, high speed, transparent information, and reach into lower-tier user segments. As an important part of Internet finance, the Internet Consumer Loan business has gone through several years of rapid, unchecked growth, introducing the public to the concept of spending ahead of income and effectively stimulating domestic demand. At the same time, various problems have begun to emerge, and fraud risk is the most pressing of them, so anti-fraud has become an indispensable part of the financial system. Digital financial fraud has formed a complete black-market industry chain and is becoming more specialized, more organized, and better concealed, so traditional risk prevention and control methods that rely on manual experience can no longer meet the need for accurate identification and management. Intelligent anti-fraud systems based on technologies such as relationship graphs, biometrics, and machine learning have therefore become mainstream.

To address the anti-fraud problem in the field of Internet Consumer Loans, this thesis first reviews the main anti-fraud methods proposed in the domestic and international literature, especially fraud detection methods based on data mining algorithms, most of which focus only on the accuracy of recognizing fraudulent behavior and lack interpretation and analysis of the prediction results. Because the financial industry places extremely high requirements on model intelligibility, this thesis further summarizes research on machine learning interpretability techniques, which provides theoretical support for the visual interpretation of the fraud identification model. Overall, this thesis combines quantitative identification with qualitative analysis and proposes a digital risk control approach in which model identification and manual verification complement each other. While the model is used to accurately identify potential fraud risks, its internal recognition logic is displayed visually, which to some extent fills gaps left by previous studies.

In the model identification stage, this thesis first performs feature processing, feature construction, and feature selection on the original data set. Then, within the ensemble learning framework, a random forest (RF) model and a LightGBM model are built to predict fraud probabilities in batches, with cross-validation used for parameter tuning. Finally, the optimal fraud recognition model is selected by comparing evaluation metrics such as the KS value and the AUC value of the two models on the same training and validation sets.

In the manual verification stage, this thesis applies SHAP, a universal interpretability framework, to visualize the prediction results of the anti-fraud model from the perspective of both variables and samples. Reviewers can thus evaluate the model in a targeted manner and may even find valuable new clues that suggest new ideas for anti-fraud work.

In an empirical study based on two consecutive months of consumer loan data provided by a bank, this thesis selects the LightGBM model, which has the better prediction performance, as the final fraud identification model used in the subsequent model interpretation. The SHAP visualizations show that credit card consumption in the past year has the strongest influence on the model's fraud predictions, and that the higher the consumption amount, the more likely credit fraud is to occur. In short, an anti-fraud approach that combines machine learning with SHAP can not only identify fraudulent users in batches but also reveal the potential fraud characteristics of each client, which can provide valuable suggestions for lending decisions and ultimately let Internet Consumer Loans truly serve consumers.
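
The abstract selects between the two models by comparing their KS and AUC values. The following is a minimal sketch of how these two metrics can be computed, assuming binary labels (1 = fraud, 0 = non-fraud) and scores where a higher value means a higher predicted fraud probability. The function names `auc_score` and `ks_statistic`, and the toy labels and scores, are illustrative assumptions and are not taken from the thesis or its data.

```python
from typing import Sequence


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen fraud case receives a
    higher score than a randomly chosen non-fraud case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one fraud and one non-fraud label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """KS value: the largest gap, over all score thresholds, between the
    cumulative share of fraud cases and of non-fraud cases captured."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one fraud and one non-fraud label")
    # Walk through the samples from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    captured_pos = 0
    captured_neg = 0
    best_gap = 0.0
    i = 0
    while i < len(ranked):
        # Consume tied scores together so a threshold never splits a tie.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                captured_pos += 1
            else:
                captured_neg += 1
            j += 1
        gap = abs(captured_pos / n_pos - captured_neg / n_neg)
        best_gap = max(best_gap, gap)
        i = j
    return best_gap


# Toy validation set (hypothetical): 3 fraud cases, 5 non-fraud cases.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
# Hypothetical scores from two candidate models on the same samples.
model_a = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
model_b = [0.6, 0.4, 0.7, 0.9, 0.8, 0.3, 0.2, 0.1]

results = {
    name: (auc_score(labels, s), ks_statistic(labels, s))
    for name, s in (("model_a", model_a), ("model_b", model_b))
}
# Pick the candidate with the higher (AUC, KS) pair on the shared data.
best_model = max(results, key=lambda name: results[name])
for name, (auc, ks) in results.items():
    print(f"{name}: AUC={auc:.4f} KS={ks:.4f}")
print("selected:", best_model)
```

The pairwise AUC loop is quadratic in the number of samples, which is acceptable for a sketch but not for a full loan portfolio; a rank-based formula or a library routine would be the usual choice at scale.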
Keywords/Search Tags: Internet Consumer Loan, anti-fraud model, ensemble learning framework, SHAP interpretable framework