
Prediction Using Artificial Neural Network Model Of Essential Hypertension

Posted on: 2011-10-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yang
GTID: 2144360305458487
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Preface

Essential hypertension (EH) is one of the most common cardiovascular diseases. In recent years, as the economy has developed and the pace of life has quickened, a series of unhealthy lifestyles has emerged, and the mortality, morbidity and prevalence of cardiovascular disease in China have continued to rise. Hypertension is not only a disease in its own right but also an important risk factor for other cardiovascular diseases, and serious complications such as hypertensive crisis and hypertensive encephalopathy may be life-threatening. The prevention and control of hypertension therefore cannot be ignored.

Research has shown that hypertension is a multifactorial disease, characterized by a large number of risk factors and complex relationships among them. At present, disease prediction relies mainly on traditional logistic regression (LR), but a logistic regression model requires the variables to be independent and cannot deal with collinearity between them. Using logistic regression to predict a disease as complex as hypertension therefore has limitations. Artificial neural networks (ANNs), also referred to simply as neural networks (NNs), are mathematical models that simulate the way biological neural networks process information. Neural networks cope well with collinearity and interactions between variables and are good at handling non-linear, fuzzy and noisy data. Nevertheless, ANNs are currently far less widely applied in medicine than traditional logistic regression.

The study site was Zhangwu County in Liaoning Province, where a survey found a standardized prevalence of hypertension of 35%, a rate rarely seen nationwide.
In this study, we used these survey data to build a back-propagation ANNs (BPANNs) prediction model, compared it with a logistic regression model, and evaluated the predictive performance of the ANNs using the receiver operating characteristic (ROC) curve. We also examined the predictive behaviour and characteristics of the ANNs, in order to explore new ways of predicting complex diseases such as hypertension and to provide a reference for the prevention and treatment of hypertension in rural areas.

Subjects and Methods

1. Selection of study subjects

This study used survey data from an earlier epidemiological investigation in Zhangwu County, Liaoning Province, for statistical and predictive analysis. A total of 5208 people were surveyed using multistage cluster sampling, and 4126 respondents over 30 years old were finally enrolled, of whom 1942 were women and 2184 were men.

2. Survey contents and measurement indicators

Questionnaires were completed on site by interview and measurement. The survey covered general characteristics, smoking habits, alcohol intake and so on. Blood pressure, height, weight and other indicators were measured. Five milliliters of blood were drawn after an overnight fast. After centrifugation, the serum fraction was removed and frozen in aliquots until assayed.

3. Diagnostic standard and measurement methods

Diagnostic standard for EH: according to the 1999 WHO-ISH guidelines for the management of hypertension, hypertension was defined as a systolic blood pressure (SBP) ≥140 mmHg and/or a diastolic blood pressure (DBP) ≥90 mmHg.
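
As a small illustration of the "and/or" in the diagnostic definition above, the rule can be written as a single boolean test. This is only a sketch: the function name `is_hypertensive` and its parameter names are hypothetical and do not come from the thesis.

```python
def is_hypertensive(sbp_mmhg: float, dbp_mmhg: float) -> bool:
    """Apply the threshold quoted in the text: SBP >= 140 and/or DBP >= 90 mmHg.

    Hypothetical helper for illustration; readings are assumed to be in mmHg.
    """
    # "and/or" means either condition alone is sufficient.
    return sbp_mmhg >= 140 or dbp_mmhg >= 90


# Made-up readings, not data from the survey.
for sbp, dbp in [(150, 80), (130, 95), (139, 89)]:
    print(sbp, dbp, is_hypertensive(sbp, dbp))
```
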
Blood pressure was measured according to a unified standard under standard conditions. Cholesterol, triglyceride, high-density lipoprotein (HDL), low-density lipoprotein (LDL), serum sodium, serum potassium, serum iron and serum calcium were measured with a 7150 automatic biochemistry analyzer (Hitachi, Japan), and blood glucose was measured with a blood glucose analyzer (Johnson & Johnson, USA).

4. Establishment of the ANNs

The ANNs model was a three-layer BP neural network with one hidden layer. The input-layer neurons were the hypertension-related factors with P<0.05 in univariate analysis; the output layer had one neuron (whether or not the subject had hypertension according to the diagnostic criteria); and the number of hidden-layer neurons was determined experimentally on the basis of the mean square error. The hidden-layer activation function was tansig and the output-layer activation function was logsig.

After balancing by gender and age, the data (4126 cases) were randomly divided in a ratio of 3:1 into a total training set (3096 cases) and a test set (1030 cases), used to build and to test the models respectively. To prevent over-fitting, the total training set was further divided at random in a ratio of 3:1 into a train set (2334 cases) and a check set (762 cases), and the check set was used periodically to check the results of training.

5. Statistical methods

The ANNs prediction model of hypertension was built in Matlab 7.1; the logistic regression prediction model was built and the ROC curves were drawn in SPSS 13.0. The cut-off for the predicted probability was 0.5: a subject was predicted to have hypertension when p≥0.5 and not to have it otherwise. A two-sided α=0.05 was regarded as statistically significant.

Results

1. Prediction of hypertension using the unconditional single-factor logistic regression model

Univariate analysis of hypertension was conducted on the survey data.
A total of 22 hypertension-related factors with p<0.05 were selected and used as input variables for the predictive models.

2. Prediction of hypertension using the multivariate unconditional logistic regression model

(1) Establishment of the multivariate unconditional logistic regression model

Multivariate unconditional logistic regression analysis was carried out on the total training set (3096 cases). The indicators chosen by single-factor analysis served as independent variables (height and weight had been transformed into BMI and so did not enter the model), and whether the subject had hypertension served as the dependent variable. The model used maximum likelihood estimation with forward stepwise selection; the entry criterion was p<0.05 and the removal criterion was p>0.10. After stepwise regression, 9 factors entered the model. In the Omnibus Tests of Model Coefficients, the step statistic was χ2=4.335 and the model statistic was χ2=1439.457. On the total training set the consistency rate was 78.42%, specificity 80.45% and sensitivity 76.62%.

(2) Prediction using the multivariate unconditional logistic regression model

The logistic regression model was used to predict whether the subjects in the test set (1030 cases) had hypertension. The consistency rate was 77.48%, specificity 80% and sensitivity 74.85%.

3. Prediction of hypertension using BPANNs

(1) Establishment of the BPANNs

The BPANNs model had three layers. The 22 indicators chosen by single-factor analysis served as input variables, the hidden layer had 22 neurons, and the output layer had one neuron (whether the subject had EH). The target error was set to 0.01, the learning rate to 0.1 and the maximum number of training epochs to 2000.
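
The 22-22-1 architecture described above can be sketched as a single forward pass. This is a minimal illustration in plain Python rather than the Matlab model used in the study: the weights are random placeholders, not the trained values, the input vector is made up, and the helper names (`init_network`, `forward`, `mean_square_error`) are hypothetical. Only the layer sizes, the tansig and logsig activations and the 0.5 cut-off come from the text.

```python
import math
import random

N_INPUTS = 22   # input variables reported in the text
N_HIDDEN = 22   # hidden-layer neurons reported in the text


def tansig(x):
    """Hyperbolic tangent sigmoid; equals 2 / (1 + exp(-2x)) - 1."""
    return math.tanh(x)


def logsig(x):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(rng):
    """Placeholder weights for illustration (NOT the trained values)."""
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(N_INPUTS)]
               for _ in range(N_HIDDEN)],
        "b1": [rng.uniform(-1, 1) for _ in range(N_HIDDEN)],
        "W2": [rng.uniform(-1, 1) for _ in range(N_HIDDEN)],
        "b2": rng.uniform(-1, 1),
    }


def forward(net, x):
    """Return the network output for one subject with 22 input values."""
    hidden = [
        tansig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["W1"], net["b1"])
    ]
    return logsig(sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"])


def mean_square_error(targets, outputs):
    """Mean of squared differences between targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


net = init_network(random.Random(0))
x = [0.5] * N_INPUTS          # made-up, already-scaled inputs
p = forward(net, x)
print(p, "hypertension" if p >= 0.5 else "no hypertension")
```

Because the output activation is logsig, the output always lies strictly between 0 and 1, which is what allows it to be compared against the 0.5 cut-off in the same way as a logistic regression probability.
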
After 17 training epochs, the training mean square error (MSE) was 0.126262 and the gradient was 137.276/1e-010. Training ended when the mean square error on the test set reached its minimum. For the trained BPANNs model, the consistency rates on the train set and the check set were 81.06% and 77.95% respectively, and the consistency rate, specificity and sensitivity on the total training set were 80.30%, 84.48% and 76.16% respectively.

(2) Prediction using the BPANNs

The BPANNs model was used to predict whether the subjects in the test set (1030 cases) had hypertension. The consistency rate on the test set was 78.83%, specificity 81.57% and sensitivity 76.42%.

4. Comparison of the predictive ability of the BPANNs and the logistic regression model for hypertension

(1) Comparison of predicted results

The consistency rate, sensitivity and specificity of the neural network model were all higher than those of the logistic regression model.

(2) Comparison of the area under the ROC curve

ROC curves were drawn for the BPANNs and the logistic regression model. The area under the ROC curve was 0.782 (95% CI [0.768, 0.797]) for the logistic regression model and 0.800 (95% CI [0.786, 0.814]) for the BPANNs.

Discussion

The causes of hypertension are complex and its risk factors are many. Some risk factors may interact with one another or be multicollinear. These complex relationships affect the fitting of predictive models and seriously hamper prediction of and research on hypertension. This study therefore used the survey data from Zhangwu County, Liaoning Province, to build a back-propagation ANNs (BPANNs) model for predicting EH, compared it with a logistic regression model, and evaluated the predictive behaviour and characteristics of the ANNs. When building a neural network model there is no uniform standard for choosing functions and parameters, so each problem must be analyzed on its own terms.
The model in this study was the BP neural network, also known as a "feed-forward back-propagation network", which is the most widely used in the medical field and embodies the essence of neural networks. Because any continuous function on a closed interval can be approximated by a BP network with a single hidden layer, this study used a three-layer BP neural network (with one hidden layer). Because a larger number of neurons requires a larger sample size, only factors closely related to hypertension, that is, those with p<0.05 in univariate analysis, were selected as input variables. Multi-category input variables (such as ethnic group) were coded as dummy variables to make better use of the data. The number of hidden-layer neurons and the training function were determined by experiment: compared with other settings, the mean square error was small and stable when the number of neurons was 22 and the training function was trainlm. The initial weights of the network were set to random numbers in the interval (0, 1). Because different initial values give different ANNs models, the best model was selected after numerous experiments. To avoid over-fitting, the check set was used at intervals to supervise the training process.

In this study, the consistency rate, sensitivity and specificity of the neural network model were higher than those of the logistic regression model. The consistency rates of the logistic regression model and the neural network model were 77.48% and 78.83% respectively, indicating that the predictive ability of the neural network model was better than that of the logistic regression model. When ROC curves were used to evaluate the two models, the area under the curve (AUC) was 0.782 for the logistic regression model and 0.800 for the ANNs.
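
The evaluation measures compared above can be computed as follows. This is a sketch under stated assumptions: "consistency rate" is taken to mean the overall proportion of correct predictions, the AUC is computed by the rank-based (Mann-Whitney) formulation rather than by the SPSS procedure used in the study, the function names are hypothetical, and the labels and scores are made-up toy values, not study data.

```python
def classification_rates(y_true, y_pred):
    """Consistency rate, sensitivity and specificity from 0/1 labels.

    Assumes both classes are present in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "consistency": (tp + tn) / len(y_true),  # overall agreement
        "sensitivity": tp / (tp + fn),           # hypertensives detected
        "specificity": tn / (tn + fp),           # non-hypertensives cleared
    }


def auc(y_true, scores):
    """Area under the ROC curve as the probability that a random positive
    case scores higher than a random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = hypertension, 0 = no hypertension.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off from the text

print(classification_rates(y_true, y_pred))
print(auc(y_true, scores))
```

The consistency rate, sensitivity and specificity depend on the 0.5 cut-off, whereas the AUC summarizes performance over all possible cut-offs, which is why the two kinds of measure are reported separately.
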
The ROC comparison also suggests that the neural network model fits slightly better than LR for diseases such as hypertension, which have many risk factors with complex relationships among them.

Neural networks still have some issues to be resolved. First, the resulting network changes with the choice of parameters, functions, initial values and so on. There is still no theoretical basis for judging these settings, so they can only be determined by experience and testing. Second, unlike logistic regression, there is no recognized principle for entering and removing the input variables of a neural network. Third, the medical interpretation of the effect of each factor on the dependent variable is unclear, and hypothesis testing, confidence intervals and related issues need further study.

Conclusion

The experiments showed that, for a disease as complex as hypertension, the neural network prediction model performed better than the logistic regression model. ANNs can therefore serve as a necessary complement to logistic regression, and neural network prediction has broad application prospects in complex diseases.
Keywords/Search Tags: Neural network, Hypertension Forecast, Logistic regression