Application Of Decision Tree Model In The Diagnosis Of Type 2 Diabetes

Posted on: 2019-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: E L Ma
Full Text: PDF
GTID: 2334330545479977
Subject: Statistics
Abstract/Summary:
With the rapid growth of the Internet, changes in people's lifestyles, and the aging of the population, the incidence of type 2 diabetes worldwide is rising year by year. The increase is fastest in developing countries, where the disease is taking on epidemic proportions. Globally, diabetes has become a major non-communicable disease threatening people's health and lives, ranking after cardiovascular diseases and tumors. Preventing type 2 diabetes is therefore of great significance for controlling the number of diabetic patients.

The decision tree is representative of methods that discover concept structure from large data sets: it imposes little model structure in advance and constructs concepts from the data alone. As a typical data mining technique, it has been widely applied. This thesis uses the data mining algorithms ID3, C4.5, and CART to construct decision tree models in order to identify the factors associated with diabetes and to provide a theoretical basis for prevention and for clinical work in hospitals. The performance of the C4.5 algorithm, the ID3 algorithm, and the classification and regression tree (CART) algorithm is compared on the diabetes data we collected, and the corresponding research algorithms are designed.

The main data come from 1,922 cases, diabetic patients and healthy individuals, at a hospital in Qinhuangdao, Hebei Province. Seventeen indicators collected for the whole sample form the basis of this study. The collected data set is imported into R, where the corresponding data source is established. The three classical decision tree algorithms are then used to build separate models on the training set. Finally, the test set is used to verify and evaluate the models trained by each decision tree algorithm.

The thesis is divided into five chapters, arranged as follows. The first chapter briefly introduces the background and significance of the study, the state of research on decision tree methods and on the treatment of diabetes in China and abroad, and the main algorithms, methods, and contents of the thesis. The second chapter explains in detail the basic concepts, algorithm description, and data mining process of the C4.5 algorithm. The third chapter introduces the basic concepts and variables of CART, the selection of the best cut points, and the handling of missing values. The fourth chapter introduces the principle, advantages, and disadvantages of the ID3 algorithm. In the fifth chapter, the three decision tree algorithms are applied to the data of this thesis, results on the training set and the test set are obtained, and the accuracy of the three models is compared and analyzed.
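
The three algorithms named in the abstract differ mainly in the criterion used to choose a split: ID3 is usually described in terms of information gain, C4.5 in terms of gain ratio, and CART (for classification) in terms of Gini impurity. The following is a minimal Python sketch of those three textbook criteria, not the thesis's own R implementation. The function names and the four-row toy data are hypothetical and are not taken from the study's 1,922 cases; the sketch also treats a feature as categorical, whereas CART proper uses binary splits.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group the class labels by the value of one categorical feature."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())


def information_gain(values, labels):
    """ID3-style criterion: entropy before the split minus weighted entropy after."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5-style criterion: information gain divided by the split information."""
    split_info = entropy(values)  # entropy of the feature's own value distribution
    if split_info == 0:
        return 0.0  # feature takes a single value, so it cannot split the data
    return information_gain(values, labels) / split_info


def gini_decrease(values, labels):
    """CART-style criterion: Gini impurity before the split minus weighted Gini after."""
    n = len(labels)
    weighted = sum(len(g) / n * gini(g) for g in partition(values, labels))
    return gini(labels) - weighted


# Hypothetical toy data: one made-up binary indicator and the class label.
indicator = ["high", "high", "normal", "normal"]
status = ["diabetic", "diabetic", "healthy", "healthy"]

print(information_gain(indicator, status))
print(gain_ratio(indicator, status))
print(gini_decrease(indicator, status))
```

At each node, a tree-building procedure would evaluate the chosen criterion for every candidate indicator and split on the one with the highest score; the pruning, continuous-attribute handling, and missing-value handling discussed in the chapters are outside this sketch.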
Keywords/Search Tags:decision tree, CART, C4.5, ID3, type 2 diabetes