Research On The Diagnosis Of Heart Disease Based On Data Mining Technology

Posted on: 2019-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: Q Yue
GTID: 2334330548452303
Subject: Software engineering
Abstract/Summary:
In recent years, cardiovascular disease has remained the leading cause of human death, and patients with heart disease are getting younger. Diagnosis and treatment of this disease are therefore among the most urgent and vital topics in medical research. Because many factors influence heart disease, improving diagnostic efficiency is an important problem. This thesis uses the 298 heart disease cases in the UCI machine learning database to build heart disease classifiers with the BP neural network algorithm, a support vector machine, and random forest, and identifies the best of the three by comparison. Classification performance is then improved with an ensemble of support vector machines, which can help doctors diagnose accurately. The main contents of this thesis are as follows:

(1) Understand the diagnostic indicators of heart disease and choose suitable indicators as the objects of study. Of the 75 available indicators, 14 were finally chosen for this study.

(2) Preprocess the heart disease indicator data. First, delete records with missing values in any attribute so that the data are complete. Second, convert the data into the ARFF format supported by WEKA. Then normalize all attributes of the data as required by the different algorithms. Finally, build classifiers using different attribute selection methods to select the relevant attributes.

(3) Model the heart disease indicator data. Build one classifier each with the BP neural network algorithm, the support vector machine, and random forest. Different algorithms have different parameter requirements, so the best parameter combination is chosen for each to achieve the best classifier performance.

(4) Evaluate and analyze the three classifiers and optimize the best one. The best classifier is chosen by comparing modeling time, interpretability, error, and cost. In this evaluation, the support vector machine classifier performs best. This classifier is then optimized by applying the Bagging algorithm to the single support vector machine, which further improves its performance.

The experimental results show that the best classifier is the support vector machine classifier, with a classification accuracy of 84.8993%, whereas the BP neural network and random forest classifiers reach 78.1879% and 77.5168%, respectively. Compared with the single support vector machine classifier, optimizing with the Bagging algorithm gives a 0.9% increase in ROC Area, a 0.54% decrease in root mean squared error, and a 1.0905% decrease in root relative error, so the ensemble support vector machine classifier shows a significant improvement in performance.
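The preprocessing and Bagging steps described above can be sketched in a few lines. This is a minimal, standard-library Python sketch under stated assumptions, not the WEKA workflow used in the thesis: the `?` missing-value marker, all function names, and the toy records are invented for illustration, and a simple nearest-centroid learner stands in for the support vector machine so that the example stays self-contained.

```python
import random
from collections import Counter

MISSING = "?"  # assumption: missing values are marked with "?", as in ARFF files


def drop_missing(rows):
    """Keep only records in which every attribute has a value."""
    return [r for r in rows if MISSING not in r]


def min_max_normalize(features):
    """Rescale each attribute column to the range [0, 1]."""
    columns = list(zip(*features))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in features
    ]


def train_centroid(features, labels):
    """Stand-in base learner (NOT an SVM): one mean vector per class."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)]
            for y, xs in grouped.items()}


def predict_centroid(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))


def train_bagging(features, labels, n_models=11, seed=0):
    """Bagging: fit each base model on a bootstrap sample of the training set."""
    rng = random.Random(seed)
    n = len(features)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train_centroid([features[i] for i in idx],
                                     [labels[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Combine the base models by majority vote."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy records (invented): age, resting blood pressure, label (1 = disease).
raw = [
    [63, 145, 1], [67, 160, 1], [62, 150, 1], [65, "?", 1],
    [37, 120, 0], [41, 118, 0], [35, 110, 0], ["?", 125, 0],
]
clean = drop_missing(raw)                        # two incomplete records removed
X = min_max_normalize([r[:-1] for r in clean])   # attributes scaled to [0, 1]
y = [r[-1] for r in clean]
ensemble = train_bagging(X, y)
```

Calling `predict_bagging(ensemble, X[0])` returns the majority label for the first cleaned record. Because each base model sees a different bootstrap sample, the vote averages out the instability of any single model, which is the general motivation for applying Bagging to a single classifier.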
Keywords/Search Tags: data mining, heart disease, BP neural network algorithm, support vector machine, random forest