
Research On Mining Common Risk Factors Of Multi-diseases And Predicting Disease

Posted on: 2014-06-22 | Degree: Master | Type: Thesis
Country: China | Candidate: X M Di | Full Text: PDF
GTID: 2298330467953109 | Subject: Computer Science and Technology
Abstract/Summary:
In recent years, many countries have been actively promoting reform and accelerating the development of medical and health services. The ongoing informatization of healthcare plays a key role in providing technical support for decision-making, and it also promotes the effective integration of information technology with diagnosis and treatment standards and with daily supervision. As medical informatization advances, the medical industry has accumulated a very large amount of unstructured data. Data mining techniques can therefore be applied in the medical and health fields to uncover the knowledge hidden in such data. Applied in clinical practice, this knowledge and these rules can support medical decisions and provide hospital doctors with auxiliary treatment plans. Knowledge obtained through data mining has been shown to be effective not only in predicting chronic diseases and improving medical decision-making, but also in reducing medical accidents and improving recommended standards of care and the quality and efficiency of health services.
This thesis first introduces the common data mining methods used to identify disease risk factors. Because diseases are complex, a patient often does not suffer from a single disease alone but may also carry risk factors associated with other complications. Based on patients' disease characteristics and their clinical diagnosis and treatment, and taking data sets of hypertension and hyperlipidemia patients as an example, we use several data mining methods to obtain the common risk factors of hypertension and hyperlipidemia. This effectively reduces the data dimensionality and improves disease prediction accuracy without any increase in cost.
Studies in recent years have found that integrated (ensemble) classifiers show clear advantages over single classifiers. We chose nine standard data sets from the UCI repository and conducted an experiment in the Weka data mining environment. We find that integrated classifiers improve classification accuracy compared with single classifiers, and that integrated classifiers using C4.5 and REPTree as base classifiers perform better than those using Decision Stump as the base classifier.
Given these advantages in classification and prediction, we apply classifier integration to disease prediction, with the aim of improving prediction accuracy. Taking hypertension and hyperlipidemia as an example, an integrated classifier can help diagnose whether a patient suffers from hypertension, from hyperlipidemia, or from both at the same time. Based on the prediction results, the doctor can adjust the diagnosis and treatment plan. On the Weka platform, the integrated classifiers for predicting hypertension and hyperlipidemia are evaluated by classifier consistency and error in addition to AUC values. We find that the overall performance of the integrated classifiers is better than that of single classifiers when predicting hypertension and hyperlipidemia, and that the C4.5 integrated classifier performs best.
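The comparison between single and integrated classifiers described above can be illustrated with a small sketch. It is not the thesis's Weka setup: it assumes plain majority voting as a simplified stand-in for classifier integration, and the labels and base-classifier votes are made-up toy values chosen so that the base classifiers err on different samples. The function names (`majority_vote`, `accuracy`, `auc`) are hypothetical helpers written for this illustration.

```python
# Toy illustration only: the labels and votes below are invented for this
# sketch and are NOT data or results from the thesis.

def majority_vote(votes_per_classifier):
    """Combine 0/1 votes from several base classifiers, sample by sample.

    Returns (predictions, scores). Each score is the fraction of base
    classifiers voting 1; a prediction is 1 when that fraction exceeds 0.5.
    """
    n_classifiers = len(votes_per_classifier)
    scores = [sum(sample_votes) / n_classifiers
              for sample_votes in zip(*votes_per_classifier)]
    predictions = [1 if s > 0.5 else 0 for s in scores]
    return predictions, scores


def accuracy(labels, predictions):
    """Fraction of samples whose prediction matches the label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """AUC as the probability that a positive sample outranks a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth (1 = disease present) and three base classifiers,
# each wrong on two of six samples, but on different samples.
labels = [1, 1, 1, 0, 0, 0]
base_votes = [
    [1, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0],
]

ensemble_pred, ensemble_score = majority_vote(base_votes)

for i, votes in enumerate(base_votes):
    print(f"base {i}: accuracy={accuracy(labels, votes):.3f} "
          f"auc={auc(labels, votes):.3f}")
print(f"ensemble: accuracy={accuracy(labels, ensemble_pred):.3f} "
      f"auc={auc(labels, ensemble_score):.3f}")
```

In this toy case the vote corrects every individual error because the base classifiers disagree on which samples they get wrong. When base classifiers make strongly correlated errors, voting gives little or no gain, so the sketch should not be read as a general guarantee.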
Keywords/Search Tags:data mining, integrated classifier, Weka, hypertension, hyperlipidemia