A Study Of Feature Selection Method Based On Support Vector Machine And Its Application

Posted on: 2007-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: L Jiang
GTID: 2178360212975645
Subject: Software engineering
Abstract/Summary:
Statistical learning theory is a theory of machine learning for small samples. It takes the requirement of generalization ability into account and seeks the best achievable result under limited conditions. Based on statistical learning theory, a new machine learning method, the support vector machine (SVM), has recently been put forward. SVM has advantages over earlier machine learning methods in pattern recognition problems that involve small samples, high dimensionality, and non-linearity.

This study proposes a new classification-based feature selection algorithm, named Feature-Selection. The algorithm aims to find the best subset of features for classification from a group of features that may be relevant or irrelevant. Moreover, it can systematically rank all features by their degree of correlation with the categories. The algorithm was used to identify a set of combined risk factors for type II diabetes. The best subset of risk factors found for this disease consists of waistline, waistline/hip-girth, diastolic blood pressure, and age. With this subset, the sensitivity, specificity, and accuracy of SVM classification are 0.8666, 0.6420, and 0.7014, respectively.

In addition, the performance of SVM was compared with that of two other classification methods, Decision Tree and Multilayer Perceptron, for risk factor selection on the type II diabetes sample. SVM was superior to the other two, which suggests that the SVM-based feature selection algorithm is an efficient method for selecting the best subset of features for classification and identification. The Feature-Filtrate algorithm was also compared with principal component analysis, and the former proved superior to the latter for feature extraction. However, the method described above is limited to binary classification.
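The sensitivity, specificity, and accuracy quoted above are the usual confusion-matrix ratios. The sketch below shows how the three values relate to one another. It is not code from the thesis: the function name `classification_metrics` and the sample counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: diseased cases classified correctly/incorrectly.
    tn/fp: healthy cases classified correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)                # share of positives detected
    specificity = tn / (tn + fp)                # share of negatives detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases correct
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only; these are NOT data from the thesis.
sens, spec, acc = classification_metrics(tp=8, fn=2, tn=6, fp=4)
```

Accuracy is a weighted average of sensitivity and specificity, weighted by the share of positive and negative cases in the sample, so it always lies between the two, as it does for the values reported above.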
The method was then extended to multi-class classification problems by introducing a decision tree.

The study concluded with the development of a Java-based application, titled "the forecast system of type II diabetes". The system manages information on patients, customers, and so on. The Feature-Filtrate algorithm is built into the system as a data mining method, with which customers can easily estimate their health state with respect to type II diabetes. It is likely to contribute to the prognosis, diagnosis, prevention, and treatment of type II diabetes. In addition, the system may help medical organizations popularize relevant medical knowledge...
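The abstract does not spell out how the best feature subset is searched for. As a rough illustration of the general idea of classification-based (wrapper) feature selection, the sketch below scores every non-empty feature subset with a classifier and keeps the best one. Everything in it is an assumption: the exhaustive search, the function names, and the toy data are hypothetical, and a simple nearest-centroid classifier stands in for the SVM so that the example runs with the standard library alone.

```python
from itertools import combinations


def nearest_centroid_loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen columns.

    Stand-in scorer (assumption): a cross-validated SVM would take its place.
    """
    correct = 0
    for i in range(len(rows)):
        # Build per-class centroids from every sample except sample i.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(rows, labels)):
            if j == i:
                continue
            total = sums.setdefault(label, [0.0] * len(features))
            for k, f in enumerate(features):
                total[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        # Predict the class whose centroid is closest (squared Euclidean distance).
        def dist(label):
            return sum((rows[i][f] - sums[label][k] / counts[label]) ** 2
                       for k, f in enumerate(features))

        predicted = min(sums, key=dist)
        correct += predicted == labels[i]
    return correct / len(rows)


def best_feature_subset(rows, labels, n_features, score=nearest_centroid_loo_accuracy):
    """Score every non-empty subset of feature indices; return (subset, score).

    Subsets are visited smallest first and replaced only on a strictly
    better score, so ties go to the smaller subset.
    """
    best, best_score = None, -1.0
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            s = score(rows, labels, subset)
            if s > best_score:
                best, best_score = subset, s
    return best, best_score


# Toy data, made up for illustration: feature 0 separates the classes, feature 1 is noise.
rows = [(1.0, 5.0), (1.2, 0.0), (0.9, 3.0), (3.0, 4.0), (3.1, 1.0), (2.9, 2.5)]
labels = [0, 0, 0, 1, 1, 1]
subset, accuracy = best_feature_subset(rows, labels, n_features=2)
```

Exhaustive search evaluates 2^n - 1 subsets, so it is only practical for a small number of candidate features. For larger sets, a greedy or ranking-based search would be the usual substitute.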
Keywords/Search Tags: SVM, Feature-selection, Multi-classification, Type II diabetes, Forecast system