Application Of The Tree Model In Psychosocial Epidemiology Study

Posted on: 2008-01-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Z Qian
GTID: 1114360215988389
Subject: Epidemiology and Health Statistics
Abstract/Summary:
With changes in the disease spectrum and in the modern model of medicine, the scope of epidemiological research has shifted from infectious to non-infectious diseases, from acute to chronic illnesses, and from disease to health. Social and psychological factors have drawn increasing attention from researchers. Psychosocial epidemiology research is characterized by multiple inputs and a single output: multiple factors at multiple levels, together with their correlations and interactions, contribute to the outcome in complex ways. Traditional analysis methods are limited by assumptions about the distribution of the data and by other conditions; many of them are constrained by computing capacity and are hard to interpret, and the information they provide lacks completeness and intuitiveness. Data-based alternative methods that offer convenient operation, reliable results, and complete, intuitive explanations should therefore be introduced.

Data mining based on the tree structure model is a procedure for extracting hidden, previously unknown, and potentially useful information and knowledge from massive, incomplete, noisy, fuzzy, and random real-world data. From the characteristics displayed in the training sample, data mining finds an accurate description or model for each category, which can be used to describe important data classes or to forecast future trends in the data. Thanks to advances in artificial intelligence (AI), and given sufficient data and computing capacity, many valuable functions can be completed automatically, which lets researchers concentrate on the questions that they must solve themselves.

Cognitive decline and hearing loss are common among aged people and can harm both the physical and the psychological health of aged individuals, lowering quality of life and shortening life expectancy.
Clarifying the risk factors of cognitive decline and hearing loss will help improve quality of life.

Objective: (1) To discover the factors that influence cognitive function and hearing loss and the way they take effect. (2) To compare the performance of tree structure models grown with several common algorithms. (3) To discuss the application of the tree model to data with different types of dependent variable.

Data source: In the cognition study, 1065 aged people without clinical evidence of dementia were selected from five communities; 461 were male and 604 were female. The following items were investigated: 1) Cognitive ability examination. All subjects were evaluated with five cognitive tests: Mini-Mental State Exam (MMSE), arithmetic, digit span, picture filling and block design, using standardized assessment procedures. To obtain a complete profile of cognitive function, 294 of them were additionally evaluated with digit symbol, plot memory, visual attention span and spatial reasoning tests. In the longitudinal study, 373 of the 1065 aged people were retested with MMSE, arithmetic, digit span, picture filling and block design twelve to sixteen months after baseline. 2) Sociodemographic characteristics. Age, gender, occupation, education, marital status and disease history such as hypertension, diabetes and stroke were investigated, as well as lifestyle variables such as smoking, drinking, reading and doing housework. 3) Physical examination. Venous blood samples were drawn from 294 fasting participants to measure blood lipid and glucose levels, which were determined using ASPCR.

In the hearing loss study, two communities were sampled randomly in Taiyuan city. A total of 371 persons over 50 years old were sampled, including 131 men and 240 women. 1) All of them underwent binaural hearing tests at 0.5 kHz, 1 kHz, 2 kHz and 3 kHz, and were tested for blood sugar, triglyceride and cholesterol after 12 hours of fasting.
2) Sociodemographic characteristics were investigated, including age, gender, occupation, education, marital status and disease history such as hypertension, diabetes and stroke, as well as lifestyle variables such as smoking, drinking, reading and doing housework. 3) Physical examination. Venous blood samples were drawn from 294 fasting participants to measure blood lipid and glucose levels, which were determined using ASPCR.

Methods: In the cognition study, the scores of the arithmetic, digit span, picture filling and block design tests were converted into binary variables using a cut-point based on the mean and the standard deviation; 1 represented the decline group and 2 the normal group, and the decline group was taken as the target category of the study. The MMSE score was converted into a binary variable in a similar way, with 17, 20, 22 and 23 as the cut-points for the illiterate, primary school, middle school, and college or above education levels respectively; again 1 represented the decline group and 2 the normal group, and the decline group was the target category. Tree structure models were built with the CHAID, Exhaustive CHAID, CRT and QUEST algorithms, taking the binary variables above as dependent variables and all the other variables as independent variables.

In the hearing loss study, tree models were built with the CHAID method, taking the hearing thresholds of the left ear, the right ear and the best ear, and the thresholds at 500 Hz, 1000 Hz and 2000 Hz, as dependent variables and the other variables as independent variables.

Result: In the cognition study, education, marital status, sports, gender, age, cholesterol, HDL, LDL, self-rated health, smoking and alcohol drinking were risk factors for the arithmetic, block design, picture filling and digit span tests.
Lower education, a deceased spouse, less sports, female gender, older age, HDL below the normal range, LDL above the normal range, poor self-rated health, smoking and alcohol drinking were the risk characteristics that could lead to the target effect. Coronary disease history and diastolic pressure were risk factors for the MMSE examination: a positive coronary disease history and higher diastolic pressure could lead to the target effect. The index curve and the gain curve indicated that the models fit the data well. The misclassification risk of the models ranged from 0.10 to 0.38, and the overall percentage of correct predictions ranged from 72 to 92.9 percent. The tree structure models built with the CRT and QUEST methods included more variables, but some of them were not displayed in the tree diagrams. This is a property of these algorithms, which use surrogate variables in place of the split variables when there are missing values. The surrogate variables are often associated with the split variable, so they provide clues for follow-up study.

In the hearing loss study, the risk factors for hearing loss in the best ear were age, community activity, diastolic pressure and income. The risk factors for hearing loss in the left ear were age, housing situation, income and gender. The risk factors for hearing loss in the right ear were age and gender. The risk factors for hearing loss at 500 Hz were mode of transport, age, income, blood sugar, housing situation and hypertension. The risk factors for hearing loss at 1,000 Hz were age, housing situation and gender. The risk factors for hearing loss at 2,000 Hz were age, income, housing situation and gender.

Age is the major risk factor for hearing loss in aged people. As age increases, the degree of hearing loss worsens. An exception is that hearing loss in persons aged 59 or 60 is lower than in persons of neighboring ages.
Persons with one or more of the following characteristics may suffer from more serious hearing loss: frequent participation in community activities, higher diastolic pressure, monthly income below 200 Yuan, living together with children, being female, habitually traveling by public transport, hypoglycemia or hypertension.

The left and right ears differ in how their hearing loss responds to housing status (whether or not the aged person lives together with children) and to gender. The study shows that hearing loss at 500 Hz is the most serious and has the greatest number of risk factors.

Conclusion: In this article, research on the factors influencing cognitive decline in senior citizens and research on the factors influencing hearing loss were taken as two examples for a comparative study of the tree structure model. The former was used to find the differences among the information provided by trees built from the same data by different methods, and the differences between the results of traditional methods and those of the tree model. The latter was used to discuss tree modeling of the same data under different pretreatments, in an attempt to mine the information hidden in the data from different angles and to approach the truth behind the data. The way the tree model describes results and its advantages over traditional methods were also discussed.

The information extracted by the tree model agrees with that from traditional methods. This shows that the model fits the data well, which is the basic standard for evaluating a modeling method. Traditional univariable analysis interprets the influence of an individual factor and reflects local characteristics of the object, while in traditional multivariable analysis one or more other variables are removed in order to discover the effect of specific variables, which means that weak interactions cannot be found this way.
In this situation, deciding which variables to select or discard is a problem, and bias is inevitably introduced. This is because the traditional methods are based on technique: researchers have to transform existing variables into specific types to meet the requirements of the model, so information is distorted or lost and the results become difficult to explain.

The tree model is a data mining technique and is based on data. It performs better in multivariable data analysis, and it places no restriction on the distribution of the data. The dependent variable for CHAID, Exhaustive CHAID and CRT can be continuous or categorical (ordinal or nominal), whereas the QUEST algorithm can handle only categorical dependent variables. The final trees built by CHAID and Exhaustive CHAID look very similar when the same parameters are set. The stability of models built by the CRT method depends on the size and homogeneity of the sample; when the sample size is large, it performs very well. With the tree method, the researcher does not need to spend a great deal of time pretreating the data: it is enough to enter the variables and choose a suitable algorithm, and the tree model can then be built through a friendly user interface from the information in its original form. This keeps information loss to a minimum and makes operation much less complex.

The results of the tree model are consistent with those of the alternative methods, but its description of correlation and interaction is simpler and its explanation more intuitive. It can describe the correlation and interaction among more than two kinds of factors. For a categorical dependent variable, the index of response to the target effect can be computed; it is similar to the OR value and reflects the accumulated response due to the factors and the interactions among them.

The tree model method is simple and easy to operate, and its results are reliable and easy to understand.
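The chi-square idea behind CHAID-style splitting can be illustrated with a minimal Python sketch. It is a deliberate simplification and not the procedure used in the dissertation: it handles only binary predictors and a binary target (so the test has one degree of freedom), and it omits the category merging and Bonferroni adjustment that full CHAID performs. The function names, variable names and toy data are hypothetical; only the coding of the target (1 for the decline group, 2 for the normal group) follows the abstract.

```python
import math
from collections import Counter


def chi_square_2x2(xs, ys):
    """Pearson chi-square statistic and p-value (df = 1) for two binary sequences."""
    x_levels = sorted(set(xs))
    y_levels = sorted(set(ys))
    if len(x_levels) != 2 or len(y_levels) != 2:
        raise ValueError("this sketch handles exactly two levels per variable")
    n = len(xs)
    observed = Counter(zip(xs, ys))   # cell counts of the 2x2 table
    x_totals = Counter(xs)            # row totals
    y_totals = Counter(ys)            # column totals
    chi2 = 0.0
    for x in x_levels:
        for y in y_levels:
            expected = x_totals[x] * y_totals[y] / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected
    # For one degree of freedom the chi-square tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def best_split(predictors, target):
    """Return the predictor with the smallest p-value, plus all (chi2, p) results."""
    results = {name: chi_square_2x2(values, target) for name, values in predictors.items()}
    best = min(results, key=lambda name: results[name][1])
    return best, results


# Hypothetical toy data: 1 = decline group, 2 = normal group.
target = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
predictors = {
    "low_education": [1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
    "smoker":        [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
}

best, results = best_split(predictors, target)
print("split on:", best)
for name, (chi2, p) in results.items():
    print(f"{name}: chi2={chi2:.3f}, p={p:.3f}")
```

In this toy table the hypothetical `smoker` variable is distributed identically in both groups, so its chi-square statistic is zero and the split falls on `low_education`. A real tree would repeat the same selection inside each resulting node.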
The method was introduced from other research areas and performed well in psychosocial epidemiology research. We can therefore infer that it may also perform well in other research areas with similar characteristics, such as large sample sizes, multiple factors at multiple levels, and correlations and interactions between or among the factors.
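The education-specific MMSE recoding described in the Methods can be sketched in Python as follows. The cut-points (17, 20, 22, 23) and the group codes (1 for decline, 2 for normal) come from the abstract; the function name, the dictionary keys and the treatment of a score exactly equal to the cut-point are assumptions, since the abstract does not say whether the cut-point itself counts as decline.

```python
# Cut-points by education level, as listed in the abstract.
MMSE_CUTPOINTS = {
    "illiterate": 17,
    "primary school": 20,
    "middle school": 22,
    "college or above": 23,
}

DECLINE, NORMAL = 1, 2  # group codes used in the abstract


def mmse_group(score, education, inclusive=True):
    """Recode an MMSE score as DECLINE (1) or NORMAL (2) for a given education level.

    Assumption: with inclusive=True a score equal to the cut-point counts as
    decline; the abstract does not state which convention was used.
    """
    cut = MMSE_CUTPOINTS[education]
    declined = score <= cut if inclusive else score < cut
    return DECLINE if declined else NORMAL


# Hypothetical subjects: (MMSE score, education level).
subjects = [(17, "illiterate"), (18, "illiterate"), (21, "middle school"), (27, "college or above")]
groups = [mmse_group(score, education) for score, education in subjects]
print(groups)
```

The resulting binary variable is what a tree algorithm would receive as the dependent variable, with the decline group as the target category.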
Keywords/Search Tags: Aged people, Cognition, Hearing Loss, Tree structure model, CHAID, CRT, QUEST