A Study Of Feature Selection Method Based On Support Vector Machine And Its Application

Posted on:2007-01-12

Degree:Master

Type:Thesis

Country:China

Candidate:L Jiang

GTID:2178360212975645

Subject:Software engineering

Abstract/Summary:

Statistical learning theory is a theory of machine learning for small samples. It takes generalization ability into account and seeks the best achievable result under limited conditions. Based on statistical learning theory, a new machine learning method, the support vector machine (SVM), has recently been put forward. SVM has advantages over earlier machine learning methods in pattern recognition problems involving small samples, high dimensionality and non-linearity.

This study proposes a new classification-based feature selection algorithm, named Feature-Selection. The algorithm searches for the best subset of features for classification from a group of features that may be either relevant or irrelevant. Moreover, it can systematically rank all features by their degree of correlation with the categories. The algorithm was used to identify a set of combined risk factors for type II diabetes. The best subset of risk factors found for this disease consists of waistline, waistline/hip-girth, diastolic blood pressure and age. The sensitivity, specificity and accuracy of SVM classification with this subset are 0.8666, 0.6420 and 0.7014, respectively. In addition, the performance of SVM was compared with that of two other classification methods, Decision Tree and Multilayer Perceptron, for risk factor selection on the type II diabetes sample; SVM was superior to the other two. This suggests that the SVM-based feature selection algorithm is an efficient method for selecting the best subset of features for classification and identification. The Feature-Filtrate algorithm was also compared with principal component analysis, and the former proved superior to the latter for feature extraction. However, the method described above is limited to binary classification. It was then extended to handle multi-class classification problems by introducing a decision tree.

The study concludes with the development of a Java-based application, titled "the forecast system of type II diabetes". The system manages information on patients, customers, and so on. The Feature-Filtrate algorithm is built into the system as a data mining method, with which customers can easily estimate their health state with respect to type II diabetes. The system is likely to contribute to the prognosis, diagnosis, prevention and treatment of type II diabetes. In addition, it may help medical organizations popularize relevant medical knowledge...
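
The sensitivity, specificity and accuracy figures quoted in the abstract follow the usual confusion-matrix definitions for a binary classifier. The sketch below shows how the three values are computed from counts of true/false positives and negatives. The function name `classification_metrics` and the sample counts are assumptions made for illustration only; they are not taken from the thesis or its data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: diseased cases predicted as diseased
    fn: diseased cases predicted as healthy
    tn: healthy cases predicted as healthy
    fp: healthy cases predicted as diseased
    """
    positives = tp + fn          # all truly diseased samples
    negatives = tn + fp          # all truly healthy samples
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    sensitivity = tp / positives                      # true positive rate
    specificity = tn / negatives                      # true negative rate
    accuracy = (tp + tn) / (positives + negatives)    # overall hit rate
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, spec, acc = classification_metrics(tp=8, fn=2, tn=15, fp=5)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.4f}")
```

Accuracy is a weighted average of sensitivity and specificity, with weights given by the share of positive and negative samples, which is why the reported accuracy of 0.7014 lies between the reported specificity (0.6420) and sensitivity (0.8666).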

Keywords/Search Tags:

SVM, Feature-selection, Multi-classification, Type II diabetes, Forecast system

Related items

1	Diabetes Clinical Medical Record Information Management System Design And Development Applications
2	Research On Prevention And Treatment Of Type 2 Diabetes Mellitus Based On Data Mining
3	Research On Semi-supervised Feature Selection Model And Algorithm For Mixed-type Data
4	Bayesian Personalized Ranking Model With Multi-type Implicit Feedback Confidence
5	Research On Feature Analysis And Energy Consumption Forecast Of Server In Data Center
6	The Modified K-MEANS Algorithm And Its Application To Type-I Diabetes Glucose Data Clustering
7	Feature Selection Mechanism For Multimodal Social Media Data With Privacy Protection
8	Research On Multi-Type Sensors Oriented Fingerprint Image Segmentation
9	Visual Object Tracking Based On Feature And Model Selection
10	Research On Embedded Multi-label Feature Selection Algorithm