
Research On The Application Of Personal Credit Scoring Card Model Based On Multi-source Fusion Data

Posted on: 2021-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Song
Full Text: PDF
GTID: 2518306314453524
Subject: Statistics
Abstract/Summary:
In recent years, China's economy has developed by leaps and bounds, people's demand for material culture has kept changing, and attitudes toward consumption have changed greatly as well. The concept of "early consumption" is being accepted by more and more people, especially young people. Demand for consumer loans is rising, which brings a series of problems such as multi-source borrowing and high default rates.

On the one hand, data that can reflect credit are scattered across multiple financial institutions, so they are difficult to integrate effectively and are used far less efficiently than they could be. Moreover, most previous research focuses on innovating and improving methods and neglects the exploration of data sets. On the other hand, from the perspective of modeling methods, current machine learning models predict defaulters more accurately but are hard to interpret, while the traditional credit scorecard model based on logistic regression is highly interpretable and stable but less accurate than machine learning models.

To address these problems, this paper evaluates personal credit risk using multi-source data that include user identity and property information, bank card information, transaction information, loan repayment information, loan information, loan application information, and multi-source loan information. It combines XGBoost with the traditional logistic-regression credit scorecard model into a fusion credit scorecard model, in order to improve both the prediction accuracy and the interpretability of the model in a multi-source data environment.

The main work of this paper is as follows.

The first part describes the background of the topic and the theoretical significance of the research, reviews the domestic and international literature on credit evaluation, summarizes the research content and framework of the paper, and finally explains its innovations.

The second part introduces the relevant theories and methods of credit evaluation models, including the traditional credit scorecard model based on logistic regression and ensemble learning methods such as XGBoost, and briefly summarizes the model evaluation metrics.

The third part describes the basic ideas and research methods of multi-source data fusion. It first explains the definition and characteristics of multi-source data fusion, then describes the data processing methods used under multi-source data fusion, including the treatment of class imbalance, missing values, and logical errors, and finally discusses the characteristics of credit scoring under multi-source data fusion.

The fourth part introduces the basic characteristics of the multi-source data selected for this paper, sorts out the logical relationships among the data sources, removes samples with logical errors, and fills in missing values according to the data source. It then builds the traditional credit scorecard model and a standalone XGBoost ensemble learning model on the multi-source data and evaluates both models.

The fifth part constructs the credit scorecard model based on multi-source data fusion. It first builds the sub-scoring models for multi-source data fusion, then evaluates the performance and stability of the model, and finally compares the traditional credit scorecard model, the XGBoost ensemble learning model, and the multi-source data fusion credit scorecard model. The results show that the multi-source data fusion credit scorecard model outperforms both the traditional credit scorecard model and the standalone XGBoost model.

The sixth part presents the conclusions and outlook. It summarizes the main conclusions of the paper and discusses future research directions for multi-source data fusion.

This research finds that, compared with consumption data from a single source, fusing a user's multi-source data lets the fused data cover as many aspects of the user as possible, so that the user's credit behavior can be described accurately. The innovation of this paper is to explore the construction of a credit scorecard model from the perspective of multi-source data fusion. The paper integrates data from different scenarios and sources, including the user's identity and property information, bank card information, loan repayment information, transaction information, loan information, loan application information, and multi-source loan information. After cleaning the multi-source data and handling the class imbalance, it constructs a credit scorecard model based on multi-source data fusion.
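
For readers unfamiliar with scorecards, the following is a minimal sketch of the standard scaling step that turns a predicted default probability (from logistic regression, XGBoost, or a combination) into a score. The abstract does not state the scaling used in the thesis, so the base score of 600, base odds of 50:1, and 20 points to double the odds are illustrative assumptions, and the function names are hypothetical.

```python
import math


def scorecard_params(base_score=600.0, base_odds=50.0, pdo=20.0):
    """Return (offset, factor) such that score = offset + factor * ln(odds).

    Assumed convention (not taken from the thesis): odds are good:bad,
    base_score is the score at base_odds, and pdo is the number of points
    that doubles the odds.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return offset, factor


def score_from_default_probability(p_default, offset, factor):
    """Map a predicted default probability in (0, 1) to a scorecard score."""
    odds_good = (1.0 - p_default) / p_default
    return offset + factor * math.log(odds_good)


offset, factor = scorecard_params()
for p in (0.01, 0.02, 0.10):
    # Lower default probability yields a higher score.
    print(p, round(score_from_default_probability(p, offset, factor), 1))
```

Because the score is linear in the log-odds, each feature's contribution in a logistic-regression scorecard can be read off as a number of points, which is the interpretability advantage the abstract refers to.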
Keywords/Search Tags: Multi-Source Data Fusion, Data Preprocessing, XGBoost, Integrated Credit Scorecard