Credit Risk Assessment Based On Machine Learning

Posted on: 2012-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: H Z Wei
Full Text: PDF
GTID: 2218330362953625
Subject: Computer Science and Technology
Abstract/Summary:
Credit risk assessment is crucial for financial institutions, and machine learning algorithms can significantly improve its accuracy. In this thesis, we propose three new credit risk assessment models based on machine learning.

The first model is based on eigencredits and support vector data description (SVDD). We map all samples into a feature space spanned by the eigencredits, which are the principal components of the creditworthy vectors. In this feature space the creditworthy vectors cluster more tightly, and we then use SVDD to model the creditworthy samples. The model copes with class imbalance, whether negative samples are too few or too many. Experimental results show that the eigencredits and SVDD model is effective when the number of bad-credit samples is small.

The second model is based on weightly selected attribute bagging (WSAB), a new ensemble learning model. WSAB modelling can be divided into two steps. First, the attribute weights are computed using an attribute evaluation method. Second, attribute subsets are constructed according to these weights: attributes with larger weights have a higher probability of being selected into each subset. The training and test samples are then projected onto each attribute subset. A single scoring model is built on each projected training set, and all of the single scoring models vote on the test instances. An individual classifier that uses only the selected attributes can become more accurate because some redundant and uninformative attributes are eliminated. In addition, selecting attributes by probability helps to keep the ensemble diverse. Experimental results show that WSAB improves the performance of the individual classifier more than other ensemble learning methods do.

The third model is based on kernel matching pursuit (KMP) and KMP ensembles.
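As an illustration of the second model, the two WSAB steps described above (weighting the attributes, then sampling attribute subsets by weight) might be sketched as follows. This is a minimal sketch and not the thesis implementation: the abstract does not name the attribute evaluation method or the base scoring model, so the class-mean-difference weight in `attribute_weights` and the nearest-centroid scorer are assumptions made purely for illustration, as are all function names and parameters.

```python
import random
from collections import Counter

def attribute_weights(X, y):
    """Assumed attribute evaluation: gap between class means, scaled by range.
    Expects binary labels 0 and 1, with both classes present."""
    weights = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        pos = [v for v, lab in zip(col, y) if lab == 1]
        neg = [v for v, lab in zip(col, y) if lab == 0]
        spread = (max(col) - min(col)) or 1.0
        gap = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
        weights.append(gap / spread + 1e-9)  # small floor keeps every weight positive
    return weights

def sample_subset(weights, k, rng):
    """Draw k distinct attribute indices; larger weights are more likely to be drawn."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        pick = rng.choices(remaining, weights=[weights[j] for j in remaining])[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)

def fit_centroids(X, y, attrs):
    """Assumed base scorer: one centroid per class on the projected samples."""
    centroids = {}
    for label in set(y):
        rows = [[row[j] for j in attrs] for row, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(c) / len(rows) for c in zip(*rows)]
    return centroids

def predict_centroids(centroids, x, attrs):
    """Project x onto attrs and return the label of the nearest centroid."""
    proj = [x[j] for j in attrs]
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(proj, centroids[lab])),
    )

def wsab_fit(X, y, n_models=15, k=2, seed=0):
    """Step 1: weight the attributes. Step 2: build one scorer per sampled subset."""
    rng = random.Random(seed)
    weights = attribute_weights(X, y)
    models = []
    for _ in range(n_models):
        attrs = sample_subset(weights, k, rng)
        models.append((attrs, fit_centroids(X, y, attrs)))
    return models

def wsab_predict(models, x):
    """Majority vote over all single scoring models."""
    votes = Counter(predict_centroids(c, x, attrs) for attrs, c in models)
    return votes.most_common(1)[0][0]
```

Calling `wsab_fit(X, y, k=3)` on a list of numeric rows `X` with 0/1 labels `y` returns the ensemble, and `wsab_predict(models, x)` returns the voted label for one row. Because the subsets are drawn by probability and not by taking the top-k attributes, different ensemble members can see different attributes, which is the source of diversity the abstract refers to.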
KMP learns a function that approximates a given target as a linear combination of functions chosen from a kernel-based dictionary of basis functions: starting from an empty basis, a greedy optimization algorithm appends basis functions one at a time. A KMP ensemble builds a collection of KMP classifiers that are accurate yet independent of one another, and classifies a new instance by combining their predictions. Experimental results show that KMP is an effective credit assessment model, with advantages such as higher accuracy, shorter training time, and sparser solutions. In addition, the KMP ensemble markedly improves on the classification performance of a single KMP classifier on large datasets.
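The greedy construction described above can be illustrated with a minimal sketch of basic matching pursuit over a kernel dictionary. It assumes a Gaussian kernel centred on each training point, a squared-error criterion, and labels coded as -1 and +1. None of these choices is stated in the abstract, and the function names and the `gamma` and `n_basis` parameters are hypothetical. The ensemble step is not shown.

```python
import math

def gaussian_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as lists of floats."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def kmp_fit(X, y, n_basis=3, gamma=1.0):
    """Greedy matching pursuit over a kernel dictionary.

    Returns a list of (center, coefficient) pairs. Starts from an empty
    basis and appends one kernel function per iteration.
    """
    n = len(X)
    # Dictionary: one kernel function per training point, evaluated on all points.
    D = [[gaussian_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    residual = list(y)
    basis = []
    for _ in range(n_basis):
        best = None
        for i in range(n):
            norm_sq = sum(v * v for v in D[i])  # positive, since k(x, x) = 1
            dot = sum(d * r for d, r in zip(D[i], residual))
            gain = dot * dot / norm_sq  # drop in squared residual if i is chosen
            if best is None or gain > best[0]:
                best = (gain, i, dot / norm_sq)
        _, i, alpha = best
        basis.append((X[i], alpha))
        # Remove the part of the residual explained by the chosen function.
        residual = [r - alpha * d for r, d in zip(residual, D[i])]
    return basis

def kmp_decision(basis, x, gamma=1.0):
    """Value of the learned linear combination of kernel functions at x."""
    return sum(alpha * gaussian_kernel(c, x, gamma) for c, alpha in basis)

def kmp_predict(basis, x, gamma=1.0):
    """Classify by the sign of the decision value."""
    return 1 if kmp_decision(basis, x, gamma) >= 0 else -1
```

The returned basis holds only `n_basis` kernel functions no matter how many training points there are, which is the sense in which the resulting model is sparse.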
Keywords/Search Tags: credit risk assessment, eigencredits, support vector data description, weightly selected attribute bagging, kernel matching pursuit