Research On GA-based Subspace Classification And Its Application To Personal Credit Evaluation

Posted on: 2015-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: H R Jiang
GTID: 2298330452959411
Subject: Information management and information systems
Abstract/Summary:
With the development of information technology, banks face the problem of analyzing and processing large-scale, high-dimensional, complex data in personal credit risk assessment. It is crucial for a personal credit risk assessment system to establish a good risk-grade classification model based on users' personal information, credit records, and consumption records. However, modeling on massive data requires much time and memory, and the resulting classification model is usually complex and inefficient. Furthermore, data points in the sample space become much sparser as the number of features increases, which invalidates many distance-based measures. In addition, many of these features are either redundant or irrelevant to the classification target and might degrade the performance of the model.

To solve the above problems, this thesis presents a new genetic algorithm-based subspace classification algorithm for SVM (GS-SVM). A modified sample selection method is developed to select a subset of the training data based on both confidence and the convex hull. Representative samples are then selected to train the SVM models by considering the distances between classes and the sample distribution. The algorithm adopts a matrix-form mixed encoding, and a genetic algorithm is used to optimize the feature subspace of the representative samples and the classification parameters of the SVM simultaneously. The SVM classification model is then built from the representative samples in the optimized feature subspace. Experimental results on ten UCI datasets show that the proposed algorithm selects both small sample subsets and refined feature subspaces, and outperforms traditional classification algorithms. Finally, experiments on a typical personal credit assessment dataset show that the proposed algorithm not only achieves high classification accuracy but also identifies prominent feature subspaces that provide sufficient information for the classification task.
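
The idea of optimizing a feature subspace and the SVM parameters in one chromosome can be illustrated with a minimal sketch. This is not the thesis's GS-SVM implementation: it uses a flat mixed encoding (a binary feature mask plus two real-valued genes) rather than the matrix form mentioned in the abstract, it omits the sample selection step, and `toy_fitness` is a hypothetical stand-in for cross-validated SVM accuracy. The feature count, the set `INFORMATIVE`, the parameter ranges, and all GA settings are assumptions chosen only for illustration.

```python
import random

N_FEATURES = 8
INFORMATIVE = {0, 2, 5}          # hypothetical "useful" features for the toy fitness
LOG2_C_RANGE = (-5.0, 15.0)      # assumed search range for log2(C)
LOG2_GAMMA_RANGE = (-15.0, 3.0)  # assumed search range for log2(gamma)


def random_chromosome(rng):
    """Mixed encoding: a binary feature mask plus two real-valued SVM genes."""
    return {
        "mask": [rng.randint(0, 1) for _ in range(N_FEATURES)],
        "log2_c": rng.uniform(*LOG2_C_RANGE),
        "log2_gamma": rng.uniform(*LOG2_GAMMA_RANGE),
    }


def toy_fitness(chrom):
    """Placeholder score; a real run would return cross-validated SVM accuracy."""
    selected = {i for i, bit in enumerate(chrom["mask"]) if bit}
    if not selected:
        return 0.0
    hits = len(selected & INFORMATIVE)
    extras = len(selected - INFORMATIVE)
    score = hits / len(INFORMATIVE) - 0.05 * extras
    # Mild penalty for parameters far from an arbitrary, made-up optimum.
    score -= 0.01 * abs(chrom["log2_c"] - 3.0)
    score -= 0.01 * abs(chrom["log2_gamma"] + 5.0)
    return score


def clip(value, bounds):
    low, high = bounds
    return max(low, min(high, value))


def tournament(population, fitness, rng, k=3):
    """Pick the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a, b, rng):
    """Uniform crossover on the mask, arithmetic blend on the real genes."""
    w = rng.random()
    return {
        "mask": [x if rng.random() < 0.5 else y
                 for x, y in zip(a["mask"], b["mask"])],
        "log2_c": w * a["log2_c"] + (1 - w) * b["log2_c"],
        "log2_gamma": w * a["log2_gamma"] + (1 - w) * b["log2_gamma"],
    }


def mutate(chrom, rng, bit_rate=0.1, sigma=1.0):
    """Flip mask bits with probability bit_rate; add Gaussian noise to real genes."""
    return {
        "mask": [1 - bit if rng.random() < bit_rate else bit
                 for bit in chrom["mask"]],
        "log2_c": clip(chrom["log2_c"] + rng.gauss(0, sigma), LOG2_C_RANGE),
        "log2_gamma": clip(chrom["log2_gamma"] + rng.gauss(0, sigma),
                           LOG2_GAMMA_RANGE),
    }


def run_ga(fitness, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the best individual always survives unchanged.
        children = [max(population, key=fitness)]
        while len(children) < pop_size:
            parent_a = tournament(population, fitness, rng)
            parent_b = tournament(population, fitness, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    return max(population, key=fitness)


best = run_ga(toy_fitness)
selected_features = [i for i, bit in enumerate(best["mask"]) if bit]
print("selected features:", selected_features)
print("C = 2**%.2f, gamma = 2**%.2f" % (best["log2_c"], best["log2_gamma"]))
```

To apply the sketch to real data, `toy_fitness` would be replaced by a function that trains an SVM on the columns selected by `mask`, with `C = 2**log2_c` and `gamma = 2**log2_gamma`, and returns its cross-validated accuracy. Searching the parameters on a log scale is a common convention for SVMs and is an assumption here, not something the abstract states.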
Keywords/Search Tags: Subspace classification, Support vector machine, Genetic algorithm, Sample selection, Personal credit assessment