
Personal Credit Risk Assessment Under Unbalanced Data Sets

Posted on: 2022-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Huang
Full Text: PDF
GTID: 2518306323496514
Subject: Applied Statistics
Abstract/Summary:
Credit is no longer just a matter of personal morality; in the Internet age it has become an invisible ID card that affects personal life and even social resource allocation and economic development. In recent years, with the development of Internet finance and rising personal consumption, demand for personal credit loans has grown steadily, and so has the risk of default. Default is not only the main risk faced by lending companies but also one of the factors that can destabilize the financial system, the economy, and society. How to use existing data mining techniques and data resources for accurate personal credit risk assessment and management is therefore an important issue for enterprises and a core issue in preventing financial risk. Traditional credit risk assessment relies mainly on personal credit investigation, which does not meet the demands for diversity and comprehensiveness in the era of big data, and traditional methods struggle with massive, complex, time-sensitive data. The data era has brought enterprises richer data resources, but it has also complicated the problem: how to mine valuable information from these resources and accurately assess personal credit risk has become a core problem for financial enterprises.

In this paper, we use data from Lending Club, a well-known P2P online lending platform in the United States, as an example to model and evaluate personal credit risk. To assess personal credit risk more effectively, we combine statistical methods with cutting-edge machine learning and deep learning techniques. First, we build personal credit profiles of users based on the principle of user profiling, which helps compensate for the incomplete representation of personal credit information in traditional evaluation models. Analyzing the differences among users' credit profiles provides additional reference points for credit risk assessment. Second, we use a cluster-based under-sampling method to address data imbalance, which makes the model more stable; this is one of the focal points of this paper. Third, we combine a classification tree algorithm, the information value (IV) method, and recursive feature elimination (RFE) to rank feature importance, select features, and reduce the dimensionality of the data. Finally, we fit the data with a random forest model and a logistic regression model and obtain good fits with both, which demonstrates the soundness and effectiveness of the personal credit analysis and evaluation process in this paper. In addition, we compare these results with the classification performance of random under-sampling under the two models, which demonstrates that cluster-based under-sampling is a reasonable choice. Together, these steps form a complete personal credit risk assessment process that can serve as a reference for future research.
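The abstract names cluster-based under-sampling without specifying the variant. The sketch below assumes one common form: cluster the majority class (non-defaulting borrowers) with k-means, then draw from each cluster in proportion to its size until roughly as many majority rows remain as there are minority rows. The function names, the choice of k = 3, and the synthetic two-feature data are illustrative assumptions and are not taken from the thesis.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Tiny Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers: k distinct rows
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def cluster_undersample(majority, n_keep, k=3, seed=0):
    """Keep about n_keep majority rows, drawn from each cluster in proportion to its size."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        quota = round(n_keep * len(members) / len(majority))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept

# Synthetic stand-in for an imbalanced loan data set (hypothetical data).
data_rng = random.Random(1)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(300)]
minority = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(30)]

balanced_majority = cluster_undersample(majority, n_keep=len(minority))
print(len(majority), "majority rows reduced to", len(balanced_majority))
```

Because each cluster's quota is rounded separately, the result can differ from `n_keep` by a row or so. Sampling per cluster, unlike plain random under-sampling, keeps every region of the majority class represented in the reduced set.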
Keywords/Search Tags: Credit risk assessment, User profile, Unbalanced data, Feature selection, Random forest model, Logistic regression model