
Prediction Methods Study Of Tuberculosis Infection Monitoring Data Of HIV/AIDS In Shanxi Province

Posted on: 2010-04-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J F Zhao
Full Text: PDF
GTID: 1114360275461744
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objectives: In cooperation with the Center for Disease Control and Prevention of Shanxi Province, this study used tuberculosis data from the network monitoring system of the Chinese Center for Disease Control and Prevention (CDC) and TB/HIV co-infection data from a survey of 5 counties in Yuncheng city carried out under the fifth tuberculosis Global Fund project. The aims were to establish a monitoring and evaluation system for TB/HIV co-infection, to provide a scientific basis for trend prediction and control measures for TB/HIV co-infection, and to determine, using statistical analysis methods, the severity of tuberculosis infection in HIV/AIDS patients and its influencing factors.

Methods: TB/HIV co-infected patients were surveyed and monitored in 5 counties of Yuncheng city: Ruicheng, Xiaxian, Xinjiang, Jiangxian and Jishan. A follow-up form suited to the actual co-infection situation in Shanxi was prepared and revised. The collected data were analyzed with rare event logistic regression, random-effects logistic regression, Bayesian estimation and related methods, programmed in SAS 9.1.3 and Stata 10.0. Data on smear-positive tuberculosis patients were collected from the CDC Tuberculosis Information Management System database for Shanxi Province and compared. Time series analysis (ARMA and ARIMA models) and the Microsoft Time Series algorithm, a data mining model in Microsoft SQL Server Analysis Services, were used to predict the trend of tuberculosis incidence in Shanxi Province.

Results:

1. Because the CDC Tuberculosis Information Management System database did not include information on TB/HIV co-infection, this study extended the database for Shanxi Province by adding a follow-up survey form, a death registration form, and items such as income and nutritional status, providing baseline information for the prevention and treatment of TB/HIV co-infection.

2. When the two values of the dependent variable occur with very different frequencies, classical logistic regression may underestimate the probability of the rare event. The parameters and the estimated probability were therefore adjusted. Five models were fitted: classical logistic regression, logistic regression with prior correction, logistic regression with MCN prior correction, logistic regression with weighted correction, and logistic regression with MCN weighted correction. The Vuong test was used to compare the models, and the results showed that logistic regression with MCN weighted correction was the better-fitting model. Probabilities were estimated by maximum-likelihood estimation, weighted maximum-likelihood estimation, approximately unbiased estimation and approximate Bayesian estimation; the approximate Bayesian estimates were the best. According to the approximate Bayesian estimate, the tuberculosis infection rate of HIV/AIDS patients in Shanxi Province was about 0.05.

3. When group effects across the five counties are considered, patients within the same region are more similar to one another, so the tuberculosis infection outcomes of HIV/AIDS patients in the same region are not independent. To address this non-independence, a generalized linear mixed-effects model was used to fit random-effects logistic regression to the data on tuberculosis infection in HIV/AIDS patients. Classical logistic regression and logistic regression with prior correction, MCN prior correction, weighted correction and MCN weighted correction were each fitted. The goodness-of-fit indicators showed that rare event random-effects logistic regression with weighted MCN correction fitted the data better. CD4 level affected the probability of tuberculosis in HIV/AIDS patients: for a one-unit change in the logarithm of the CD4 value, the risk of tuberculosis infection in HIV/AIDS patients was 74.9 percent lower.

4. Fitting a generalized linear mixed-effects model requires numerical integration of the joint likelihood function or an approximation of the model; restricted pseudo-likelihood is a first-order Taylor approximation of the generalized linear mixed-effects model. Bayesian estimation was introduced into the generalized linear mixed-effects model, and the results showed that the posterior estimates were close to the restricted pseudo-likelihood estimates when noninformative priors were selected.

5. The ARIMA(1,1,0)(1,1,0)12 model fitted the data better. The mean absolute error between predicted and actual values was 136.64, with a mean relative error of 8.10%. The forecasts for 2009 indicated that the number of smear-positive cases in 2009 would be much lower than in previous years, climbing from March to August and highest in April 2009. The results of the Microsoft Time Series algorithm were consistent with those of the ARIMA(1,1,0)(1,1,0)12 model.

6. In the comparison of historical predictions from January 2007 to August 2008, the Microsoft Time Series algorithm had a mean absolute error of 116.7 and a mean relative error of 6.60 percent, while the ARIMA model had a mean absolute error of 104.4 and a mean relative error of 5.90 percent. The relative prediction error of the ARIMA model was therefore lower than that of the Microsoft Time Series algorithm.

Conclusions:

1. Adding TB/HIV co-infection content to the revised tuberculosis monitoring form allows the effect of tuberculosis treatment to be evaluated better and TB/HIV co-infection to be controlled more effectively, and provides a theoretical basis for establishing a cooperation model for TB and AIDS control.

2. Rare event logistic regression was superior to classical logistic regression for rare event analysis and is worth promoting and applying to rare diseases. The Vuong test was used to evaluate the different regression models.

3. Under the Bayesian framework, noninformative priors were specified for the parameters of the generalized linear mixed-effects model and MCMC was applied for parameter estimation; the resulting estimates were consistent with those obtained by restricted pseudo-likelihood. Bayesian models provide an effective alternative for nonlinear mixed-effects models: because Bayesian estimation does not rely on asymptotic theory or approximation, it is more accurate and natural than restricted pseudo-likelihood estimation under classical statistics. With implementation software now available for Bayesian analysis, Bayesian models have become even more attractive.

4. Time series models (ARMA models) handle stationary series readily and fit well; they are practical mathematical models and prediction tools for infectious diseases, especially tuberculosis. Choosing an appropriate analytical model is the key to prediction performance.

5. The Microsoft Time Series algorithm was introduced to the analysis of medical time series data for the first time, and a training model was constructed for smear-positive cases in Shanxi Province from January 2005 to December 2008. The algorithm combines autoregression with decision tree techniques, enriching the methods available for predicting medical time series data. Although its prediction error was slightly larger than that of the ARIMA model in this case, and the robustness of its predictions needs further study, its principle is simple and it is easy to understand and operate, so it is worth promoting as a new prediction algorithm.
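
The abstract compares the ARIMA model and the Microsoft Time Series algorithm by average absolute error and average relative error but does not define either measure. The sketch below assumes the usual definitions (mean of |actual - predicted|, and mean of |actual - predicted| / actual); the function names and the case counts are hypothetical and are not taken from the dissertation.

```python
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    """Reject empty or mismatched series before computing any error."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| over all periods (assumed definition)."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| / actual, as a fraction (assumed definition)."""
    _check(actual, predicted)
    if any(a == 0 for a in actual):
        raise ValueError("relative error is undefined when an actual value is zero")
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly smear-positive case counts, invented for illustration only.
actual_cases = [1000.0, 1200.0, 800.0]
forecast_cases = [950.0, 1260.0, 840.0]

print("mean absolute error:", mean_absolute_error(actual_cases, forecast_cases))
print("mean relative error:", mean_relative_error(actual_cases, forecast_cases))
```

Under these definitions, two forecasting methods can be ranked on the same hold-out months by passing each method's forecasts with the same actual series, which is the kind of comparison reported for January 2007 to August 2008.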
Keywords/Search Tags:Tuberculosis(TB), HIV/AIDS, Rare event logistic regression, Bayes Statistics, Time series, Microsoft time series algorithm