Comparison Of Data Mining Methods Based On The Data From A Home Environment Test Of Patients With Parkinson’s Disease

Posted on: 2015-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: B W Yang
Full Text: PDF
GTID: 2308330464951878
Subject: Statistics
Abstract/Summary:
This paper applies data mining methods to predict symptom severity and cause from the data of a test battery for patients with Parkinson’s disease, and compares the methods in order to find the best model. We have two data sets: a cause data set and a severity data set. We apply classification methods to the cause data set and prediction methods to the severity data set.

We apply four data mining methods for classification: Decision Trees, Random Forests, SVM and KNN. Decision Trees use only 9 of the 18 input variables and have a high accuracy rate. Random Forests use all 18 input variables, and their accuracy rate is always higher than that of Decision Trees. SVM has a high accuracy rate, but it is still lower than those of Decision Trees and Random Forests. KNN has the lowest accuracy rate of the four methods, and Random Forests have the highest.

We apply four data mining methods for prediction: Decision Trees, Random Forests, GLM and MLP. Decision Trees use only 6 of the 18 input variables and have a high accuracy rate. Random Forests use all 18 input variables, and their accuracy rate is always higher than that of Decision Trees. GLM is a statistical modeling method and has a high accuracy rate. MLP is a well-known artificial intelligence method, and its accuracy rates are always high. MLP has the highest accuracy rate of the four methods...
Keywords/Search Tags: Data mining, Decision tree, Support vector machine, Parkinson’s disease
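
The abstract ranks methods by their accuracy rate on held-out data. A minimal sketch of that kind of comparison follows, using only the Python standard library. It is not the thesis's implementation or data: the data are synthetic, the 18 features merely mirror the number of input variables mentioned in the abstract, and the function names (`make_synthetic_data`, `knn_predict`, `majority_predict`, `accuracy`), the 70/30 split, and `k=5` are all assumptions made for illustration. Only KNN and a majority-class baseline are shown.

```python
import math
import random
from collections import Counter


def make_synthetic_data(n_per_class=60, n_features=18, seed=0):
    """Build two synthetic classes (illustrative only, NOT the thesis data)."""
    rng = random.Random(seed)
    rows = []
    for label, centre in ((0, 0.0), (1, 1.5)):
        for _ in range(n_per_class):
            features = [rng.gauss(centre, 1.0) for _ in range(n_features)]
            rows.append((features, label))
    rng.shuffle(rows)
    return rows


def knn_predict(train, x, k=5):
    """Predict by majority vote among the k nearest training rows (Euclidean)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def majority_predict(train, x):
    """Baseline: always predict the most frequent training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def accuracy(predict, train, test):
    """Fraction of test rows whose predicted label equals the true label."""
    correct = sum(predict(train, x) == y for x, y in test)
    return correct / len(test)


# Hold out 30% of the rows for testing (an assumed split, chosen for the sketch).
data = make_synthetic_data()
split = int(0.7 * len(data))
train, test = data[:split], data[split:]

results = {
    "KNN (k=5)": accuracy(knn_predict, train, test),
    "Majority baseline": accuracy(majority_predict, train, test),
}

# Print the methods from highest to lowest accuracy rate.
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Other methods could be added to `results` in the same way, as long as each one is scored on the same held-out rows so that the accuracy rates are comparable.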