Ensemble Based Support Vector Machine Method And Its Application In Credit Risk

Posted on: 2014-01-17  Degree: Master  Type: Thesis
Country: China  Candidate: N Han  Full Text: PDF
GTID: 2249330398476988  Subject: Statistics
Abstract/Summary:
The support vector machine (SVM) has been an active research topic in machine learning in recent years. Because it handles nonlinear problems and high-dimensional data and works well with small samples, SVM has been applied in many fields, including face recognition, pattern recognition, and classification. Within SVM research, ensemble methods are receiving increasing attention. Ensemble-based classification aggregates many weak classifiers to obtain better performance than any single one. In this paper, we study credit risk using ensemble SVM.

Credit risk plays a key role in the national economy and has become increasingly important. Many methods have been tried to predict and manage it. In this paper, SVM models built with two different ensemble methods are proposed to predict credit risk. The main research contents are as follows:

1. Related theory and literature on credit risk. We systematically introduce the current state of credit risk assessment and SVM classification, together with the relevant theoretical basis: statistical learning theory, SVM theory, and ensemble learning theory. After summarizing the advantages and disadvantages of previous studies, this paper proposes the ensemble SVM method.

2. Establishment of the credit risk evaluation index system and the modeling approach. We summarize in detail how credit risk evaluation indicators are selected. Based on the characteristics of the samples in this paper, we first choose 24 indicators that reflect seven aspects of a company's financial condition.

3. Sample selection, model building, and empirical conclusions. We select 73 matched pairs of samples: 73 ST (Special Treatment) companies and 73 normal companies. From the original 24 indicators, 19 are finally selected as features. We then build three models (single SVM, bagging_SVM, and boosting_SVM), use them to predict the testing samples, and compare their performance. The bagging_SVM and boosting_SVM models perform better than the single SVM, and in particular achieve higher accuracy on the ST companies, which is of practical guiding significance. We also investigate the effect of ensemble size and find that models with different ensemble sizes achieve different accuracy. Comparing the three models by their ROC curves shows that bagging_SVM performs best and the single SVM worst.

This paper has two innovations. The first is a specific reference standard for selecting paired samples. In previous papers, normal samples were generally chosen at random or by considering industry alone. This paper proposes three specific and feasible standards: a matching sample should have positive profits for three consecutive years, should be the same as or similar to the risk company in industry and public sector, and should have total assets within 10% above or below those of the risk company. The second is that the study of the ensemble models considers not only the influence of different ensemble methods on accuracy, but also the influence of the number of weak classifiers.
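The three standards for selecting a matching normal company can be illustrated with a minimal sketch. This is not the thesis's own code. The `Company` record, its field names, the helper functions, and the rule of picking the eligible candidate with the closest total assets are all assumptions made for illustration. The abstract allows a "same or similar" industry and public sector, which the sketch simplifies to exact equality.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Company:
    # All field names are hypothetical; the abstract does not define a data schema.
    name: str
    industry: str
    sector: str            # stands in for the "public sector" criterion
    total_assets: float
    profits: List[float]   # annual profit, oldest first
    is_st: bool = False


def is_eligible_match(st: Company, candidate: Company, tolerance: float = 0.10) -> bool:
    """Check the three matching standards for one ST company and one candidate."""
    if candidate.is_st:
        return False
    # Standard 1: positive profit in each of the last three years.
    last_three = candidate.profits[-3:]
    if len(last_three) < 3 or any(p <= 0 for p in last_three):
        return False
    # Standard 2: same industry and sector (simplified from "same or similar").
    if candidate.industry != st.industry or candidate.sector != st.sector:
        return False
    # Standard 3: total assets within +/- tolerance of the ST company's.
    lower = st.total_assets * (1 - tolerance)
    upper = st.total_assets * (1 + tolerance)
    return lower <= candidate.total_assets <= upper


def match_pairs(st_companies: List[Company], normal_pool: List[Company],
                tolerance: float = 0.10) -> List[Tuple[Company, Company]]:
    """Pair each ST company with at most one unused eligible normal company."""
    used = set()
    pairs = []
    for st in st_companies:
        eligible = [c for c in normal_pool
                    if c.name not in used and is_eligible_match(st, c, tolerance)]
        if not eligible:
            continue  # no candidate satisfies all three standards
        # Assumption: prefer the candidate whose total assets are closest.
        best = min(eligible, key=lambda c: abs(c.total_assets - st.total_assets))
        used.add(best.name)
        pairs.append((st, best))
    return pairs
```

Each normal company is used at most once, so the result is a set of one-to-one pairs like the 73 pairs described above. An ST company with no eligible candidate is simply skipped in this sketch.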
Keywords/Search Tags: credit risk, SVM, ensemble SVM