Diabetes Mellitus Prediction Model Based On Data Mining

Posted on: 2019-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Wu
Full Text: PDF
GTID: 2404330593450101
Subject: Software engineering
Abstract/Summary:
Diabetes mellitus (DM) is a chronic disease characterized by hyperglycemia, and it has a distinct hereditary component that runs in families. In the Diabetes Atlas (Eighth Edition), the International Diabetes Federation predicts that by 2040 the global diabetic population will reach 629 million, which means that one in every ten adults will have diabetes. Over the past thirty years of social development in China, people have begun to realize that this chronic disease affects family life and personal well-being. Extracting valuable information from health data is now becoming a trend. The rapid development of the Internet and information technology has produced a large amount of personal health data, but this data has long lacked effective collation, standardization and utilization. How to mine meaningful information from the data to provide reasonable advice for diabetes prevention has become a pressing problem. Analysis based on data mining can predict how the disease develops and can also identify characteristic factors, so data mining on diabetes health data is a promising approach to diabetes prevention.

This thesis builds on existing diabetes prediction models, conducts data mining experiments on several valuable diabetes health data sets, and proposes a combined prediction model with better predictive performance and wider applicability. First, a large number of diabetes-related data sets were collected and screened. The Pima Indian Diabetes Data Set and the Diabetes 130-US hospitals data set for the years 1999-2008 were chosen; the diabetes data set provided by Dr. Schorling, MD, was also consulted; and health data from the relevant population in China were collected through a questionnaire survey. Second, a variety of data preprocessing techniques for cleaning, processing and optimization were applied to obtain usable initial data. The K-means, Logistic Regression, Decision Tree and Random Forest algorithms were then selected to perform multiple predictive analysis experiments, and the experimental results were compared and analyzed from several aspects. On this basis, a more widely applicable model with a higher prediction accuracy was proposed. In addition, the Diabetes 130-US hospitals data set for the years 1999-2008 was used to extract potential risk factors for secondary medical care among diabetic patients.
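The workflow summarized in the abstract (preprocess the data, train several classifiers, compare their accuracy on held-out records) can be illustrated with a minimal sketch. This is not the thesis's implementation: the data are synthetic, the two features (`glucose`, `bmi`) and all function names are assumptions made for illustration, and a depth-1 decision stump stands in for the Decision Tree algorithm so that the example needs only the Python standard library.

```python
import math
import random

def make_synthetic(n, seed=0):
    """Toy stand-in for health records (NOT real data): [glucose, bmi] -> label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = int(rng.random() < 0.5)
        glucose = rng.gauss(150 if label else 100, 15)
        bmi = rng.gauss(33 if label else 27, 4)
        rows.append(([glucose, bmi], label))
    return rows

def standardize(train, test):
    """Preprocessing step: scale features with the training-set mean and std."""
    dims = len(train[0][0])
    means = [sum(x[d] for x, _ in train) / len(train) for d in range(dims)]
    stds = [
        math.sqrt(sum((x[d] - means[d]) ** 2 for x, _ in train) / len(train)) or 1.0
        for d in range(dims)
    ]

    def scale(rows):
        return [([(x[d] - means[d]) / stds[d] for d in range(dims)], y) for x, y in rows]

    return scale(train), scale(test)

def fit_logistic(rows, lr=0.1, epochs=200):
    """Logistic regression trained by batch gradient descent on the log loss."""
    dims = len(rows[0][0])
    w, b = [0.0] * dims, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dims, 0.0
        for x, y in rows:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for d in range(dims):
                grad_w[d] += (p - y) * x[d]
            grad_b += p - y
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return lambda x: int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)

def fit_stump(rows):
    """Depth-1 decision tree: the single feature threshold with the best
    training accuracy. Simplification: predicts 1 when the feature exceeds
    the threshold, which suits this synthetic data only."""
    best = None
    for d in range(len(rows[0][0])):
        for t in sorted({x[d] for x, _ in rows}):
            acc = sum(int(x[d] > t) == y for x, y in rows) / len(rows)
            if best is None or acc > best[0]:
                best = (acc, d, t)
    _, feature, threshold = best
    return lambda x: int(x[feature] > threshold)

def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

data = make_synthetic(400)
train, test = standardize(data[:300], data[300:])
results = {
    "logistic_regression": accuracy(fit_logistic(train), test),
    "decision_stump": accuracy(fit_stump(train), test),
}
# Rank the candidate models by held-out accuracy, best first.
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Scoring every model on the same held-out split keeps the comparison fair; a fuller experiment would typically add cross-validation and metrics beyond accuracy, and would swap in the real data sets and full tree-based models.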
Keywords/Search Tags:Diabetes Mellitus, Machine Learning, Prediction Model, Potential Risks