
Research On Risk Assessment Of Class Imbalance Personal Credit Data Based On Improved GAN

Posted on: 2024-01-31  Degree: Master  Type: Thesis
Country: China  Candidate: Z T Deng  Full Text: PDF
GTID: 2568306920996159  Subject: Applied statistics
Abstract/Summary:
The class imbalance problem arises when the number of samples differs greatly between class labels, so that the data are far from an ideal balanced distribution. It is common in classification tasks such as fraud detection, medical diagnosis, and risk warning, and it can severely degrade the accuracy of classification algorithms. Personal credit risk assessment is a risk warning problem: an individual's credit level has long been an important factor that financial institutions use to measure that individual's default risk. China's rapid economic development since the start of this century has raised living standards and changed residents' attitudes toward consumption. Demand for personal consumer credit is growing steadily, but the associated default risk is rising as well; it has gradually become a major risk for financial institutions and affects the stability of the whole financial system. Traditional credit risk assessment techniques can no longer cope with today's large volumes of personal credit data and the redundant information they contain. To develop, financial institutions need an efficient personal credit risk assessment model, that is, a model that uses a customer's credit history to assess the default risk or credit rating implied by the customer's likely future repayment ability, as a reference for lending decisions.

This thesis treats personal credit risk assessment as a binary classification problem and uses machine learning classification models for training and testing. To address class imbalance in risk assessment, where non-performing loans are far fewer than high-quality loans, the thesis uses the generative adversarial network (GAN) algorithm from deep learning to learn the distribution of the minority-class samples and generate synthetic samples, expanding the minority class until the classes are balanced. It also proposes an improved GAN method that designs a more effective model loss function by drawing on the Guided-Loss idea.

An empirical study is conducted on 582,732 lending transaction records from a foreign personal loan platform. First, the 144 variables were screened and reduced in dimensionality. Second, to handle the sample imbalance, the improved GAN method was used to balance the data, a support vector machine was then fitted as the classifier, and the results were compared experimentally with traditional methods for handling class imbalance. Finally, in the evaluation stage, the thesis proposes a new multidimensional evaluation index that focuses on the minority class, so that the classification performance after balancing can be evaluated more comprehensively for each method. The experimental results show that the improved GAN method makes fuller use of the information in minority-class samples and performs better on multiple evaluation indexes across multiple public datasets. The method can therefore support more systematic personal credit risk assessment in financial enterprises, and it extends the research and application of GANs to personal credit risk assessment in financial institutions.
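
As a rough illustration of the balancing target and of why minority-focused evaluation matters, the sketch below counts how many synthetic minority samples would be needed to reach class balance and computes several standard minority-class metrics. It is a minimal sketch, not the thesis's method: the function names `synthetic_needed` and `minority_metrics`, the toy labels, and the choice of recall, precision, F1, and G-mean are all assumptions made for illustration, and the thesis's own multidimensional index is not specified in this abstract.

```python
import math
from collections import Counter


def synthetic_needed(labels, minority=1):
    """How many synthetic minority samples are needed to match the largest other class."""
    counts = Counter(labels)
    # Largest count among the non-minority classes (0 if there are none).
    majority_count = max((n for cls, n in counts.items() if cls != minority), default=0)
    return max(0, majority_count - counts[minority])


def minority_metrics(y_true, y_pred, minority=1):
    """Standard metrics computed with the minority class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)
    tn = sum(1 for t, p in pairs if t != minority and p != minority)

    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    g_mean = math.sqrt(recall * specificity)  # geometric mean of the two class-wise recalls
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision,
            "f1": f1, "g_mean": g_mean}


# Hypothetical toy labels: 1 = non-performing loan (minority), 0 = high-quality loan.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 7 + [1] + [1, 0]

print(synthetic_needed(y_true))
print(minority_metrics(y_true, y_pred))
```

On these toy labels the overall accuracy looks acceptable while only half of the minority class is detected, which is the kind of gap a minority-focused index is meant to expose.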
Keywords/Search Tags:credit risk assessment, class imbalance, generative adversarial network, deep learning