Identifying Important Risk Factors For Cardiovascular Events In The Rural Population Of Liaoning Based On Random Survival Forests

Posted on: 2024-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: L R Meng
GTID: 2544307088977519
Subject: Public health
Abstract/Summary:
Objective: Cardiovascular event prediction has long been a cornerstone of cardiovascular medicine, and the most widely used method is the Cox proportional hazards regression model. Over the past half century, cardiac screening technology and medical care have advanced significantly: laboratory testing has become easier, and electrocardiography and echocardiography are now widely available. These advances give researchers more opportunities to identify new risk factors for cardiovascular events. However, the Cox proportional hazards regression model is limited by the proportional hazards assumption and by the assumption that the covariates are linearly related to the log hazard ratio, so it performs poorly on data with complex internal structure. Random Survival Forest (RSF), one of the most representative machine learning algorithms, is a strong alternative to the Cox model because of its learning ability and flexibility. Studies in Western populations confirm the superior performance of RSF models in predicting cardiovascular events and suggest including additional non-traditional risk factors in prediction models. RSF models for cardiovascular event prediction have not been adequately studied in the Chinese population. Considering differences in population, race, and region, this study identifies important risk factors for cardiovascular events in the rural population of Liaoning using random survival forests.

Methods: Between July 2012 and August 2013, a representative cohort of 11,956 adults aged 35 years and older in rural Liaoning was selected by multistage stratified cluster sampling to establish the Northeast Rural Cardiovascular Health Study. The baseline population was followed up in 2015 and again in 2018 to collect events. After excluding individuals who missed follow-up visits and those with missing outcome variables, 8723 subjects were included in the analysis. Ninety-eight risk factors collected at baseline were used to predict cardiovascular outcomes during follow-up. RSF was used to determine the top 20 risk factors for cardiovascular events. Model performance was evaluated using Harrell's concordance index (C-index) and the Brier score, and compared with the Cox proportional hazards regression model. The RSF model was used to generate partial dependence plots and delineate the threshold effects of important risk factors, and Kaplan-Meier survival analysis was performed using the thresholds identified on the partial dependence plots. Continuous variables are expressed as means and standard deviations, and categorical variables as frequencies and proportions. Differences between groups were compared by the independent-samples U test or Wilcoxon rank sum test for continuous variables and by the chi-square test or Fisher exact test for categorical variables. Differences were considered statistically significant at two-sided P ≤ 0.05.

Results:
1. Basic information of the study cohort. The study cohort included 8723 subjects, of whom 445 developed cardiovascular disease, 115 died of cardiovascular causes, 153 developed coronary artery disease, and 303 had a stroke.
2. Importance ranking and thresholds of risk factors for cardiovascular events. The most important risk factor for cardiovascular disease (CVD) was age (minimum depth 1.50, threshold 60 years), followed by systolic blood pressure (2.99, 130 mm Hg), diastolic blood pressure (3.97, 80 mm Hg), and septal thickness (4.20, 0.9 cm or 1.1 cm). The most important risk factor for cardiovascular death was age (minimum depth 1.97, threshold 60 years), followed by left ventricular ejection fraction (5.45, 50%), aortic valve flow velocity (5.82, 90 cm/s or 170 cm/s), and diastolic blood pressure (6.68, 90 mm Hg). The most important risk factor for coronary artery disease was age (minimum depth 1.97, threshold 60 years), followed by left ventricular ejection fraction (5.71, 50%), aortic valve flow velocity (6.58, 90 cm/s or 170 cm/s), and pulse pressure (7.24, 60 mm Hg). The most important risk factor for stroke was age (minimum depth 1.97, threshold 60 years), followed by systolic blood pressure (2.94, 130 mm Hg), diastolic blood pressure (3.34, 80 mm Hg), and septal thickness (5.15, 0.9 cm or 1.1 cm).
3. Performance of the RSF model. The random survival forest models for cardiovascular disease, cardiovascular death, coronary artery disease, and stroke all outperformed the traditional Cox proportional hazards model, with C-indexes of 75.44% vs 73.96%, 80.33% vs 74.57%, 71.85% vs 68.44%, and 76.61% vs 75.42%, respectively, and Brier scores of 0.027 vs 0.029, 0.009 vs 0.010, 0.009 vs 0.010, and 0.009 vs 0.010.

Conclusion:
1. RSF outperformed the traditional Cox proportional hazards regression model in predicting cardiovascular events, indicating that RSF is suitable for cardiovascular risk prediction in large-scale epidemiological surveys.
2. In the 98-variable RSF model constructed for this cohort, age, blood pressure status, ultrasound indicators, and non-traditional blood test indicators may provide greater predictive value, and threshold effects were analyzed for the four most important variables. The traditional variables of smoking, alcohol consumption, HDL, LDL, triglycerides, and cholesterol did not reach the top 20 in importance, indicating that they may provide less predictive value.
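For readers unfamiliar with the C-index used above to compare the RSF and Cox models, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration only, not the thesis's implementation: the function name `harrell_c_index`, the toy data, and the simplification of skipping pairs with tied observed times are all assumptions made here.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index (hypothetical helper, not from the thesis).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # 'a' is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration): five subjects, two censored.
times = [2, 4, 6, 8, 10]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.3, 0.4, 0.7, 0.1]
c = harrell_c_index(times, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The abstract reports the C-index as a percentage, so a value of 0.7544 from a function like this would correspond to the reported 75.44%.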
Keywords/Search Tags:Random Survival Forests, Cardiovascular Events, Rural Population, Risk Factors