
Data Mining For Applied Research In The Diabetes Data

Posted on: 2004-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: H Cheng
GTID: 2208360095456166
Subject: Computer applications
Abstract/Summary:
As socio-economic conditions improve, the spectrum of diseases that threaten human health is changing. Chronic non-communicable diseases are becoming the main threat to human health, especially among the elderly. The most prominent example is type 2 diabetes mellitus, whose case numbers have recently risen rapidly worldwide. Both the latency period and the prevalence of type 2 diabetes mellitus increase with age, which indicates that it is a progressive disease. The best way to control the disease is to understand and investigate it comprehensively. Effective management of type 2 diabetes mellitus requires understanding its etiology and pathology, controlling the routes by which it develops, and identifying the key illness groups. Of these, discovering the etiology and pathology is the most important and fundamental step toward reducing morbidity.

Traditional research on controlling non-communicable diseases such as diabetes mellitus has relied on the linear reduction method. From today's point of view, however, this method has considerable limitations. To overcome them, this thesis uses bioinformatics methods to discover regularities in the disease. We introduce data mining, a core technology in bioinformatics, into research on the etiology of type 2 diabetes mellitus in order to acquire knowledge about its etiology, discover the relevant data and rules, and then build a classification and prediction system for diabetes mellitus.

The source data on diabetes mellitus come from patients' health examination reports and from random sampling. The data in the health examination reports are transformed appropriately and stored in a database to form the source data.
Because these data are incomplete, noisy and inconsistent, we apply data mining preprocessing techniques, namely data cleaning, data transformation and data reduction, to the source data.

The data mining task is to find illness regularities in the large volume of diabetes mellitus data and to build a decision system for preventing, diagnosing and predicting diabetes mellitus. Based on the type of mining task and the requirements on the mining algorithm, we choose the decision tree method. Because the diabetes mellitus data contain continuous-valued attributes, we select C4.5 as the decision tree algorithm.

Using our implementation of the C4.5 algorithm, we learn illness regularities and rules from the preprocessed diabetes mellitus data and generate a set of rules for diagnosing and predicting diabetes mellitus. The accuracy of the classifier is estimated with the holdout method.

Because the accuracy obtained for the illness group with this decision tree is not high enough, we treat the proportion of data assigned to the training set as a parameter, measure the accuracy for the illness group, and examine how the average accuracy varies with that proportion. In this way we obtain an improved classifier and the best solution for identifying the illness group.
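The two technical ideas in the abstract, splitting a continuous attribute in the style of C4.5 and varying the training-set proportion in a holdout evaluation, can be illustrated with a small sketch. It is not the thesis's implementation. It is a simplified one-split classifier that picks a threshold by gain ratio and leaves out pruning, missing-value handling and multi-level trees. The glucose values, labels and function names are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on a continuous attribute."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    p_left, p_right = len(left) / n, len(right) / n
    gain = entropy(labels) - p_left * entropy(left) - p_right * entropy(right)
    split_info = -(p_left * math.log2(p_left) + p_right * math.log2(p_right))
    return gain / split_info


def best_threshold(values, labels):
    """Try midpoints between consecutive distinct values; keep the best gain ratio."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))


def holdout_accuracy(values, labels, train_ratio, seed):
    """Fit the one-split classifier on a random training share, score it on the rest."""
    order = list(range(len(values)))
    random.Random(seed).shuffle(order)
    cut = int(len(order) * train_ratio)
    train, test = order[:cut], order[cut:]
    train_x = [values[i] for i in train]
    train_y = [labels[i] for i in train]
    threshold = best_threshold(train_x, train_y)
    # Predict the majority training class on each side of the threshold.
    low = Counter(y for x, y in zip(train_x, train_y) if x <= threshold).most_common(1)[0][0]
    high = Counter(y for x, y in zip(train_x, train_y) if x > threshold).most_common(1)[0][0]
    hits = sum((low if values[i] <= threshold else high) == labels[i] for i in test)
    return hits / len(test)


# Hypothetical toy data: fasting glucose (mg/dL) and a diagnosis label.
glucose = [86, 92, 95, 101, 106, 108, 130, 140, 151, 164, 185, 198]
diagnosis = ["healthy"] * 6 + ["diabetic"] * 6

print("threshold on all data:", best_threshold(glucose, diagnosis))
for ratio in (0.5, 0.6, 0.7, 0.8, 0.9):
    # Average over several random splits, since a single holdout split is noisy.
    scores = [holdout_accuracy(glucose, diagnosis, ratio, seed) for seed in range(20)]
    print(f"train ratio {ratio:.1f}: mean accuracy {sum(scores) / len(scores):.2f}")
```

Averaging over several random splits for each training proportion is one plausible reading of the "average accuracy" mentioned above. A full C4.5 tree would apply the same threshold search recursively at every node.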
Keywords/Search Tags: Data mining, KDD, Diabetes Mellitus, Bioinformatics, Decision Tree, C4.5