
Depression And Its Associated Factors In Chinese Over 45 Years Old

Posted on: 2021-02-19  Degree: Master  Type: Thesis
Country: China  Candidate: P Qin  Full Text: PDF
GTID: 2404330614468667  Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: This study used an XGBoost model and a random forest model to preliminarily screen the variables related to depressive symptoms in the China Health and Retirement Longitudinal Study, and then used a logistic regression model to evaluate the effect of the selected factors on depressive symptoms, in order to understand the factors related to depressive symptoms in Chinese people over 45 years old and to provide reliable evidence for prevention and treatment.

Methods:
1. This study was based on a nationally representative sample of Chinese adults aged 45-80 from the 2011 wave of the China Health and Retirement Longitudinal Study. Depressive symptoms were assessed using the 10-item Center for Epidemiological Studies Depression Scale; a score of 10 or higher indicated the presence of significant depressive symptoms.
2. An XGBoost model and a random forest model were built using optimal parameters. The top 100 variables of each model were selected according to their importance scores, and the variables common to both models were taken as the important variables. A logistic model was then fitted with the selected variables, and its Bayesian information criterion (BIC) score and 5-fold cross-validation accuracy were calculated and compared with those of a model fitted with all variables to evaluate the effect of variable selection.
3. The XGBoost and random forest models were built using scikit-learn in Python, and the logistic regression model was fitted using "glm" in R 3.6.

Results:
1. The database contained 17708 samples and 3998 variables. Taking the participants in the blood test data as the base population, the remaining 11 data files were merged, leaving 11030 people and 3998 variables. After cleaning and sorting, 8319 valid samples and 476 variables remained.
2. For the data in this study, the optimal parameters of the XGBoost model were a depth of 6, a learning rate of 0.01, and 10000 trees. The random forest model used 10000 trees.
3. The XGBoost model and the random forest model selected 22 variables in common.
4. With logistic regression fitted by stepwise selection based on AIC, the BIC value of the model was 8673.50 and the 5-fold cross-validation accuracy was 75.09%.
5. In the logistic regression model, the depression-related factors in this population, ranked by OR from largest to smallest, were living standard, marital status, memory, sleep, health, four kinds of pain (chest pain when climbing or walking fast, headache, wrist pain, and the degree of physical pain), living abilities (difficulty getting up from a chair after sitting for a long period, difficulty climbing several flights of stairs without resting, difficulty using the toilet, including getting up and down, difficulty doing household chores, difficulty managing money, and difficulty running about 1 km), and vision. The maximum OR was 3.431 for living standard, and the minimum was 1.091 for difficulty running about 1 km. These variables came mainly from three data files: basic information (1, marital status), health status and function (14), and work, retirement and pension (1, living standard).

Conclusions:
1. Cross-disciplinary, multi-dimensional data must first be organized. A logistic model fitted directly to the processed data is highly complex, so XGBoost and random forest models can be used to filter the variables and reduce the dimension. The logistic model fitted with the filtered variables did not lose accuracy, and its complexity was greatly reduced.
2. In this study, 16 depression-related variables were selected, of which the most important was living standard, followed by marital status and then memory. There were four pain indicators: chest pain, headache, wrist pain, and the degree of pain. The indicators of living ability show that depression is more likely to occur as this ability declines. Sleep, self-rated health, and poor vision were also associated with depression.
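The variable-screening step described in the Methods (take the top-ranked variables from each model and keep those common to both) and the BIC used to compare the logistic models can be sketched roughly as follows. This is a minimal illustration, not the thesis code: the function names, the variable names, and the toy importance scores are all hypothetical, and the importance scores are assumed to have been computed already by the two tree models.

```python
import math


def top_k(importances, k):
    """Return the names of the k variables with the highest importance score."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return set(ranked[:k])


def common_top_variables(imp_a, imp_b, k=100):
    """Variables that appear in the top k of both importance rankings."""
    return sorted(top_k(imp_a, k) & top_k(imp_b, k))


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


# Hypothetical importance scores (NOT values from the thesis).
xgb_importance = {
    "living_standard": 0.30,
    "sleep": 0.25,
    "memory": 0.20,
    "height": 0.15,
    "vision": 0.10,
}
rf_importance = {
    "living_standard": 0.28,
    "memory": 0.24,
    "vision": 0.22,
    "sleep": 0.16,
    "height": 0.10,
}

# With k=3 only the variables ranked in the top 3 by both models survive.
selected = common_top_variables(xgb_importance, rf_importance, k=3)
print(selected)
```

In the thesis setting, k would be 100 and the two dictionaries would hold the importance scores of the 476 cleaned variables. A lower BIC for the reduced logistic model than for the full one would indicate that the screening removed complexity without a matching loss of fit.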
Keywords/Search Tags:Chinese over 45 years old, depression symptoms, related factors of depression, Randomforest, XGBoost