Research on Multiple Disease Risk Prediction Based on Physical Examination Data

Posted on: 2018-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Xu
Full Text: PDF
GTID: 2334330515473290
Subject: Software engineering
Abstract/Summary:
Health checks help prevent the development of disease. Based on an individual's check results, a doctor can analyze potential diseases and give health guidance. Experienced doctors can assess a person's overall health status and disease risk from the examination results, but manual analysis cannot meet the growing demand for physical examinations in either efficiency or accuracy, because the volume of medical data keeps increasing and experienced doctors are in short supply. With the development of data mining technology, artificial intelligence and machine learning methods have been widely used in medical diagnosis and disease risk analysis.

Data preprocessing is one of the most important parts of machine learning. In physical examination data, results often vary widely between individuals. For a given feature, the standard deviation over the whole population is relatively large, and values below the mean far outnumber those above it, which shows that the data distribution is extremely unstable. Traditional data normalization methods cannot avoid this problem. The approach in this paper addresses it and, to a certain extent, improves the convergence speed and accuracy of the model.

This paper has three main parts. First, the FN (Fusion Normalization) method is proposed to process the features, normalizing feature values to the interval (0,1). Second, a combined model of SVMs, GBDTs and LRs is established to handle the multi-label data. Third, because the normal population far outnumbers the abnormal population for each examination index, the data are imbalanced; this is handled by setting different penalty factors on different labels according to the ratio of the labeled data.

The data set includes 62 features, such as sex and fasting blood glucose, and contains 3 labels: hypertension, diabetes and fatty liver. The data types in the data set are character and numeric. The experimental results show that, with the combined model of SVMs, GBDTs and LRs, physical examination data processed by the FN (Fusion Normalization) method gives better performance than unnormalized data, min-max normalized data and standard-normalized data.
Keywords/Search Tags: Physical examination data, FN, Multi-label classification, Ensemble model
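The abstract does not give the FN formula or the exact penalty-factor rule, so neither is reproduced here. As a rough illustration only, the sketch below shows the two baseline normalizations the abstract compares against (min-max and standard normalization) applied to a skewed feature, plus one common way to derive per-label penalty factors from class ratios. The function names, the sample values and the "balanced" weighting heuristic are all assumptions made for this example, not details taken from the thesis.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly to [0, 1] (the min-max baseline)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def penalty_factors(labels):
    """Per-class penalty weights, inversely proportional to class frequency.

    Assumption: weight = n_samples / (n_classes * class_count), a common
    "balanced" heuristic. The thesis does not state its exact formula.
    """
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return {c: n / (len(counts) * k) for c, k in counts.items()}


# Made-up values for one skewed feature: most readings sit below the mean
# and a single large reading stretches the range.
feature = [4.8, 5.0, 5.1, 5.2, 5.3, 5.4, 5.6, 16.0]

# Made-up binary labels (1 = abnormal) for the three conditions named in
# the abstract; the abnormal class is the minority in each.
labels = {
    "hypertension": [0, 0, 0, 0, 0, 0, 1, 1],
    "diabetes": [0, 0, 0, 0, 0, 0, 0, 1],
    "fatty_liver": [0, 0, 0, 0, 0, 1, 1, 1],
}

scaled = min_max(feature)
standardized = z_score(feature)
weights = {name: penalty_factors(ys) for name, ys in labels.items()}

if __name__ == "__main__":
    print("min-max:", [round(v, 3) for v in scaled])
    print("z-score:", [round(v, 3) for v in standardized])
    for name, w in weights.items():
        print(name, {c: round(v, 3) for c, v in w.items()})
```

With these sample values, min-max scaling squeezes the seven ordinary readings into a narrow band near 0 while the single large reading maps to 1, which is the kind of behaviour on skewed features that the abstract says traditional normalization cannot avoid. The per-label weights give the rarer abnormal class a larger penalty for each condition separately, in the spirit of setting different penalty factors on different labels.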