Study On Default Discriminant Model Of Individually-Owned Business Loans Based On Credit Sub-Features Analysis

Posted on: 2023-05-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Zhang
GTID: 1529307031977969
Subject: Management Science and Engineering
Abstract/Summary:
Loan default discrimination means that financial institutions such as commercial banks build mathematical models, based on historical data, to describe the relationship between loan customers' features and their default statuses. The aim is to judge the default status of new loan customers and to support lending decisions. The effectiveness of a default discriminant model bears directly on the profit and risk of financial institutions. By the end of 2021, there were 103 million registered individually-owned businesses in China, which had created 276 million jobs, so individually-owned businesses play an extremely important role in China. This thesis takes individually-owned business loans as its research object and establishes loan default discriminant models based on loan data from a national commercial bank, aiming to reduce information asymmetry and alleviate the "expensive financing and difficult financing" problems that individually-owned businesses in China face.

The research on loan default discriminant models for individually-owned businesses in this thesis involves three scientific problems.

The first problem is the construction of credit sub-feature variables (also known as "binning"). This refers to splitting a feature such as "income" into three sub-features, "high income", "middle income" and "low income", so that sub-features, rather than features, serve as the variables for building loan default discriminant models. Loan customers with the sub-feature "high income" have stronger solvency than those with "low income", so sub-features have stronger explanatory power. The problem is how to split a feature into credit sub-features (intervals) so that the sub-features represented by different intervals significantly distinguish defaulting customers from non-defaulting customers.

The second problem is the selection of the optimal model combination. When building a data-driven default discriminant ensemble model, different discriminant methods applied to the same sample yield different accuracies. For the multiple subsamples used in the majority voting method, how should the best model for each subsample be selected to form the optimal model combination in the ensemble model?

The third problem is determining the optimal cutoff point in a semi-supervised default discriminant model. A discriminant model established using loan data in which some loans have a default status and others do not is called a semi-supervised model. In practice, some loans have expired and some have not, and unexpired loans have no default status, so semi-supervised models are widely used. When the cutoff point of a default discriminant model takes different values, the accuracy also differs, so there must be an optimal cutoff point that maximizes the distinction between defaulting and non-defaulting customers. How should this optimal cutoff point be determined in a semi-supervised model?

The main innovations of this thesis are as follows:
(1) Innovation in sub-feature construction: the total information value is constructed from the sum, over the value intervals of a feature, of the difference between the proportions of default and non-default loans in each interval. The optimal split points of the intervals are then derived by maximizing the total information value, which ensures that each split credit feature significantly distinguishes default loans from non-default loans. This contrasts with existing "binning" methods, which divide the feature range arbitrarily.
(2) Innovation in optimal model selection: with the goal of maximizing AUC, this thesis finds the optimal model for each subsample and then uses the majority voting method to judge loans, ensuring that the overall discriminant ability of the ensemble model is strong. Tests across multiple data sets, multiple models, and multiple testing methods verify that the default discriminant model established in this thesis has the best performance.
(3) Innovation in the semi-supervised default discriminant model: the optimal cutoff point is obtained by maximizing the G-mean, ensuring that the semi-supervised model has a strong ability to judge default.

The findings at the three levels of the default judgment system, from small to large ("sub-feature", "feature", "criterion layer"), are as follows:
(1) In terms of credit sub-features: customers with the credit sub-features "the number of guarantors≥1" and "the profit rate in last month≥0.488" are less likely to default, while customers with "the number of guarantors=0" are more likely to default. The most important credit sub-features are "the number of guarantors≥1", "the number of guarantors=0" and "the profit rate in last month≥0.488".
(2) In terms of credit features: individually-owned businesses whose owners fall into the two occupation types "clerks and related personnel" or "business and service personnel" are more likely to repay on time, while those in the three occupation types "professional and technical personnel", "inconveniently classified practitioners" or "unknown" are more likely to default. These are the five most important occupational characteristics in the optimal combination of sub-features. The three most important features for judging loan default of individually-owned businesses are, in order, "number of guarantors", "profit rate in last month" and "occupation".
(3) In terms of the criterion layer: "financial status", "personal information" and "basic loan information" are all very important for individually-owned business loans, while "business status" and "macro environment" have less impact.
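
The sub-feature construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula for the total information value, so this sketch assumes the standard weight-of-evidence form, IV = Σ (p_good − p_bad) · ln(p_good / p_bad), and searches for a single split point only. The function names, the 0.5 smoothing constant, and the toy "income" data are all hypothetical.

```python
import math

def information_value(values, defaults, cut_points):
    """IV of a feature binned at cut_points (standard WoE form, an assumption)."""
    edges = sorted(cut_points)
    n_bins = len(edges) + 1
    bad = [0] * n_bins   # default loans per interval
    good = [0] * n_bins  # non-default loans per interval
    for v, d in zip(values, defaults):
        b = sum(v >= e for e in edges)  # index of the interval containing v
        if d:
            bad[b] += 1
        else:
            good[b] += 1
    total_bad, total_good = sum(bad), sum(good)
    iv = 0.0
    for b in range(n_bins):
        # Additive 0.5 smoothing avoids log(0) when an interval has no
        # default or no non-default loans (an illustrative choice).
        p_bad = (bad[b] + 0.5) / (total_bad + 0.5 * n_bins)
        p_good = (good[b] + 0.5) / (total_good + 0.5 * n_bins)
        iv += (p_good - p_bad) * math.log(p_good / p_bad)
    return iv

def best_single_split(values, defaults):
    """Return the split point c (bins: v < c and v >= c) with the largest IV."""
    # Skip the minimum value: splitting there would leave the lower bin empty.
    candidates = sorted(set(values))[1:]
    return max(candidates, key=lambda c: information_value(values, defaults, [c]))

# Hypothetical toy data: income (arbitrary units) and default flag (1 = default).
income = [1, 2, 3, 4, 5, 6, 7, 8]
default = [1, 1, 1, 1, 0, 0, 0, 0]
split = best_single_split(income, default)
```

On this toy data the search separates "low income" (all defaults) from "high income" (no defaults). Extending the search to several split points at once would follow the same pattern, at a higher search cost.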
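
The model-selection idea in innovation (2) can be sketched as follows: for each subsample, keep the candidate model with the highest AUC, then combine the selected models' predictions by majority vote. This is an illustration under stated assumptions, not the thesis's implementation. The candidate models are represented only by their precomputed validation scores, and the subsample names, model names, and numbers are hypothetical.

```python
def auc(labels, scores):
    """AUC as the share of (default, non-default) pairs ranked correctly.

    Ties between a default and a non-default score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def select_best_models(subsamples):
    """Map each subsample name to the candidate model with the highest AUC.

    subsamples: {name: (labels, {model_name: validation_scores})}
    """
    return {
        name: max(models, key=lambda m: auc(labels, models[m]))
        for name, (labels, models) in subsamples.items()
    }

def majority_vote(predictions):
    """Return 1 (default) if more than half of the 0/1 predictions are 1."""
    return int(2 * sum(predictions) > len(predictions))

# Hypothetical validation scores for two candidate models on two subsamples.
subsamples = {
    "s1": ([1, 0, 1, 0], {"logit": [0.9, 0.2, 0.8, 0.1],
                          "tree": [0.4, 0.6, 0.5, 0.5]}),
    "s2": ([1, 1, 0, 0], {"logit": [0.3, 0.6, 0.5, 0.4],
                          "tree": [0.7, 0.9, 0.2, 0.1]}),
}
chosen = select_best_models(subsamples)
```

Here `chosen` records one model per subsample. For a new loan, each selected model would produce a 0/1 prediction, and `majority_vote` would combine them into the ensemble's judgment.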
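
The cutoff selection in innovation (3) can be sketched in the same spirit. The G-mean is the geometric mean of the true positive rate (defaults correctly flagged) and the true negative rate (non-defaults correctly passed). The sketch assumes the G-mean is evaluated only on loans whose default status is known and that a loan is judged as default when its score is at or above the cutoff. How the thesis uses the unlabeled (unexpired) loans is not described in the abstract and is not modeled here. Names and data are hypothetical.

```python
import math

def g_mean(labels, scores, cutoff):
    """Geometric mean of TPR and TNR when score >= cutoff means 'default'."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return math.sqrt((tp / n_pos) * (tn / n_neg))

def best_cutoff(labels, scores):
    """Try every observed score as a cutoff and keep the one with the largest G-mean."""
    return max(sorted(set(scores)), key=lambda c: g_mean(labels, scores, c))

# Hypothetical predicted default scores for loans with known status (1 = default).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
cutoff = best_cutoff(labels, scores)
```

Because the G-mean multiplies the two rates, a cutoff that sacrifices either class entirely scores zero, which is why it suits loan data where defaults are much rarer than non-defaults.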
Keywords/Search Tags: Individually-owned Businesses, Individually-owned Businesses Loans, Default Discriminant, Credit Sub-features, Optimal Model Selection