The Decision Tree Algorithm And Its Application In Credit Risk Control

Posted on: 2014-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Li
GTID: 2249330398959786
Subject: Probability theory and mathematical statistics
Abstract/Summary:
In this thesis, we consider the role of the C4.5 decision tree algorithm in credit risk control. To reduce the prediction error, we use binomial confidence interval estimation to improve the pessimistic error pruning algorithm. To this end, we first briefly describe credit risk and the methods used to control it. We then describe decision trees and point out their application to credit risk control. Next, we present the C4.5 decision tree algorithm theoretically, covering both tree generation and pruning: information entropy, information gain, pessimistic error pruning, and the improved pessimistic error pruning algorithm. Finally, using Matlab, we implement the algorithm and build a model on manufacturing-sector data from Evergrowing Bank, and verify with an example that the decision tree is applicable to credit risk control and plays a significant role in it.

The results of the example show that the decision tree model identifies credit risk well and therefore supports better control of it. The indicators it selects as classification criteria are broadly applicable and persuasive. The C4.5 decision tree model has the following advantages: 1. It generates decision rules that are easy to understand; 2. It clearly shows which indicators matter most for a decision, which helps with future decisions and with accumulating data; 3. It requires little computation, so it runs quickly; 4. It can handle both small and large amounts of data; 5. It can handle continuous and discrete data at the same time. These advantages allow us to predict customers' credit risk easily and accurately from a limited amount of data, which helps bank credit risk managers and regulatory agencies grasp a customer's credit risk profile promptly and accurately, so that they can act in time to avoid or reduce credit risk.

However, the decision tree model has poor robustness. Although it classifies the training samples very well, its error rate increases considerably when it is used to classify holdout samples. In practice, a new loan applicant may belong to a different population from the modeling samples, which can cause a high misclassification rate. Therefore, when we use the decision tree to classify a new sample, we must check whether that sample belongs to the same population as the modeling samples. To address this problem, we can build separate models by industry, or we can use cluster analysis before prediction to identify which population the sample belongs to.
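The quantities named in the abstract (information entropy, information gain, and a binomial confidence bound on a leaf's error rate) can be illustrated with a minimal Python sketch. This is not the thesis's Matlab implementation, and the abstract does not specify how the improved pruning rule is defined. The function names, the default confidence level `cf=0.25`, and the normal-approximation upper bound used here are assumptions, chosen because they follow a commonly cited textbook form of C4.5-style pessimistic error estimation.

```python
import math
from collections import Counter
from statistics import NormalDist


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on a discrete attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def pessimistic_error_rate(errors, n, cf=0.25):
    """Upper confidence limit on the true error rate of a leaf.

    Treats the `errors` misclassifications among `n` training cases as a
    binomial sample and returns the upper limit of a one-sided confidence
    interval (normal approximation). `cf` is an assumed confidence level.
    """
    z = NormalDist().inv_cdf(1 - cf)
    f = errors / n  # observed error rate on the training cases
    numerator = f + z * z / (2 * n) + z * math.sqrt(
        f * (1 - f) / n + z * z / (4 * n * n)
    )
    return numerator / (1 + z * z / n)


# Hypothetical toy data: one attribute that separates the classes perfectly.
labels = ["good", "good", "bad", "bad"]
attribute = ["low_debt", "low_debt", "high_debt", "high_debt"]
gain = information_gain(attribute, labels)

# A leaf covering 6 training cases, 2 of them misclassified.
upper = pessimistic_error_rate(2, 6)
```

In a pruning step of this kind, a subtree would be replaced by a leaf when the leaf's estimated error (the upper limit times the number of cases) does not exceed the combined estimate for the subtree's leaves. Because the bound is always above the observed training error rate, it penalizes small leaves more heavily, which is what makes the estimate pessimistic.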
Keywords/Search Tags: C4.5 decision tree algorithm, credit risk, pessimistic error pruning algorithm, binomial confidence interval estimation, algorithm implementation