Credit Risk Assessment Based On Machine Learning

Posted on:2012-06-06

Degree:Master

Type:Thesis

Country:China

Candidate:H Z Wei

Full Text:PDF

GTID:2218330362953625

Subject:Computer Science and Technology

Abstract/Summary:

Credit risk assessment is crucial for financial institutions, and machine learning algorithms can significantly improve its accuracy. In this thesis, we propose three new credit risk assessment models based on machine learning.

The first model is based on eigencredits and support vector data description (SVDD). We map all samples into a feature space spanned by the eigencredits, which are the principal components of the creditworthy vectors. In this feature space the creditworthy vectors cluster more tightly, and we then use SVDD to model the creditworthy samples. This model performs well when negative samples are too few or too many. Experimental results show that the eigencredits and SVDD model is effective when the number of bad-credit samples is small.

The second model is based on weightly selected attribute bagging (WSAB), a new ensemble learning model. WSAB modelling can be divided into two steps. First, attribute weights are computed using an attribute evaluation method. Then attribute subsets are constructed according to these weights: attributes with larger weights are more likely to be selected into each subset. Training samples and test samples are then projected onto each attribute subset. A single scoring model is trained on each projected training set, and all the single scoring models vote on each test instance. An individual classifier that uses only the selected attributes can become more accurate because some redundant and uninformative attributes are eliminated. In addition, selecting attributes by probability preserves the diversity of the ensemble. Experimental results show that WSAB improves the performance of an individual classifier more than other ensemble learning methods.

The third model is based on kernel matching pursuit (KMP) and KMP ensemble. To approximate a given function, KMP algorithms learn a linear combination of functions chosen from a kernel-based dictionary of basis functions, using a greedy optimization algorithm that sequentially appends basis functions to an initially empty basis. KMP ensemble constructs a collection of KMP classifiers that are independent of each other yet accurate, and then classifies a new instance by combining their predictions. Experimental results show that KMP is an excellent credit assessment model, with advantages such as higher accuracy, shorter training time and sparser solutions. Moreover, KMP ensemble remarkably improves the classification performance of a single KMP classifier on large data sets.
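The WSAB procedure summarised in the abstract (weight the attributes, sample attribute subsets by weight, project the samples, train one scorer per subset, then vote) can be illustrated with a minimal sketch. The abstract names neither the attribute evaluation method nor the base scoring model, so both are assumptions here: the weight is a hypothetical class-mean gap score, and the base scorer is a nearest-centroid classifier. All function names and the toy data are illustrative only, not the thesis implementation.

```python
import random
from collections import Counter

def attribute_weights(X, y):
    """Assumed evaluator: gap between class means, scaled by attribute range."""
    weights = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        pos = [v for v, label in zip(col, y) if label == 1]
        neg = [v for v, label in zip(col, y) if label == 0]
        spread = (max(col) - min(col)) or 1.0
        gap = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
        weights.append(gap / spread + 1e-9)  # keep every weight positive
    return weights

def sample_subset(weights, size, rng):
    """Draw `size` distinct attributes; larger weights are more likely."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(size):
        pick = rng.choices(remaining, weights=[weights[j] for j in remaining])[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)

def project(X, subset):
    """Keep only the attributes listed in `subset`."""
    return [[row[j] for j in subset] for row in X]

def fit_centroids(X, y):
    """Assumed base scorer: one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [row for row, l in zip(X, y) if l == label]
        centroids[label] = [sum(c) / len(rows) for c in zip(*rows)]
    return centroids

def predict_centroids(centroids, row):
    """Return the label of the closest class centroid."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def wsab_fit(X, y, n_models=11, subset_size=2, seed=0):
    """Train one base scorer per weighted-random attribute subset."""
    rng = random.Random(seed)
    weights = attribute_weights(X, y)
    ensemble = []
    for _ in range(n_models):
        subset = sample_subset(weights, subset_size, rng)
        ensemble.append((subset, fit_centroids(project(X, subset), y)))
    return ensemble

def wsab_predict(ensemble, row):
    """Majority vote over the base scorers, each seeing only its own subset."""
    votes = Counter(
        predict_centroids(model, [row[j] for j in subset])
        for subset, model in ensemble
    )
    return votes.most_common(1)[0][0]

# Toy data: attributes 0 and 1 separate the classes, 2 and 3 are noise.
X_train = [
    [0.9, 0.8, 0.1, 0.5], [0.8, 0.9, 0.9, 0.4], [0.7, 0.7, 0.5, 0.6],
    [0.1, 0.2, 0.2, 0.5], [0.2, 0.1, 0.8, 0.4], [0.3, 0.3, 0.5, 0.6],
]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = creditworthy, 0 = bad credit (toy labels)
ensemble = wsab_fit(X_train, y_train)
print(wsab_predict(ensemble, [0.85, 0.75, 0.3, 0.5]))
print(wsab_predict(ensemble, [0.15, 0.25, 0.7, 0.5]))
```

Sampling without replacement inside each subset while keeping the draw random across subsets is one way to obtain both properties the abstract mentions: uninformative attributes are rarely chosen, yet the subsets still differ from one another.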
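The greedy construction behind kernel matching pursuit can likewise be sketched in a few lines. This is a minimal illustration of the basic squared-loss variant only, under stated assumptions: a Gaussian (RBF) kernel, a dictionary made of kernel functions centred on the training points, targets coded as +1/-1, and a fixed number of basis functions. The abstract does not say which KMP variant, kernel or stopping rule the thesis uses, and the KMP ensemble is not covered. The names `rbf`, `kmp_fit` and `kmp_predict` are hypothetical.

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian kernel (assumed choice of kernel)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def kmp_fit(X, y, n_basis=3, gamma=1.0):
    """Basic matching pursuit with squared loss; y holds +1/-1 targets."""
    n = len(X)
    # Row i is the dictionary function K(x_i, .) evaluated on all training points.
    gram = [[rbf(X[i], X[k], gamma) for k in range(n)] for i in range(n)]
    residual = [float(t) for t in y]  # start from an empty basis
    expansion = []                    # (support index, coefficient) pairs
    for _ in range(n_basis):
        best_i, best_score, best_alpha = 0, -1.0, 0.0
        for i in range(n):
            dot = sum(g * r for g, r in zip(gram[i], residual))
            norm_sq = sum(g * g for g in gram[i])  # >= 1 because K(x, x) = 1
            score = abs(dot) / math.sqrt(norm_sq)  # correlation with the residual
            if score > best_score:
                best_i, best_score, best_alpha = i, score, dot / norm_sq
        expansion.append((best_i, best_alpha))
        # Subtract the chosen function's contribution from the residual.
        residual = [r - best_alpha * g for r, g in zip(residual, gram[best_i])]
    return expansion

def kmp_decision(expansion, X_train, x, gamma=1.0):
    """Evaluate the learned linear combination of kernel functions at x."""
    return sum(alpha * rbf(X_train[i], x, gamma) for i, alpha in expansion)

def kmp_predict(expansion, X_train, x, gamma=1.0):
    """Classify by the sign of the decision value."""
    return 1 if kmp_decision(expansion, X_train, x, gamma) >= 0 else -1

# Toy data: two well-separated clusters with +1 / -1 targets.
X_train = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
           [2.0, 2.0], [2.1, 1.8], [1.9, 2.2]]
y_train = [1, 1, 1, -1, -1, -1]
expansion = kmp_fit(X_train, y_train, n_basis=3)
print(expansion)
print(kmp_predict(expansion, X_train, [0.1, 0.1]))
print(kmp_predict(expansion, X_train, [2.0, 2.1]))
```

Only the training points that appear in `expansion` are needed at prediction time, which illustrates the sparsity the abstract refers to: here three kernel functions stand in for six training samples.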

Keywords/Search Tags:

credit risk assessment, eigencredits, support vector data description, weightly selected attribute bagging, kernel matching pursuit

Related items

1	Optimal Kernel Methods
2	Research On Credit Risk Assessment Of Listed Companies Based On Improved Rough Set And Support Vector Machines
3	Design And Implementation Of Bank Credit Risk Assessment And Decision Support System
4	The Application Of Ensemble Support Vector Machine Based On Bagging Algorithm In Personal Credit Rating
5	Support Vector Data Description Of The Application Of The Opposition Point Detection
6	An XGBoost-Based Ensemble Learning Approach To Personal Credit Risk Assessment
7	Attribute Reduction And Classification Decision Based On Support Vector Data Description
8	Radar Target Identification Method Based On Multiple Kernel SVDD
9	Research On Some Problems And Applications In Support Vector Data Description
10	Improved SVM-KNN Model On Credit Risk Assessment