Research On Cardiovascular Disease Prediction Based On Ensemble Learning

Posted on: 2024-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: W Liu
Full Text: PDF
GTID: 2544307052993409
Subject: Applied statistics
Abstract/Summary:
With changes in lifestyle and the aging of the population, unhealthy living habits are becoming increasingly common, and the incidence of cardiovascular disease (CVD) in China's urban and rural areas is rising year by year. Its death rate has surpassed that of other diseases such as malignant tumors and respiratory diseases: two out of every five deaths among urban and rural residents in China are attributable to CVD. CVD not only poses a serious threat to human health but also places an enormous burden on the social economy, and it has become a major public health problem. In recent years, owing to the high accuracy of machine learning, and of ensemble learning in particular, in data analysis and prediction across many industries, combining machine learning algorithms with medicine has become a major trend. This opens up opportunities for future disease research and provides a new direction for studying CVD.
This paper conducts a machine-learning-based predictive study of CVD using the body indicators provided by the respondents. Starting from 11 respondent indicators (weight, height, sex, age, blood sugar level, smoking status, cholesterol level, systolic blood pressure, drinking status, diastolic blood pressure, and exercise status) together with the body mass index (BMI), two derived features are introduced through feature transformation: cat_bmi, which facilitates the study of height and weight, and cat_blood_Pressure, which facilitates the study of systolic and diastolic blood pressure. Combined with feature selection, the main risk factors of cardiovascular disease were screened as follows: age, sex, cat_bmi, cat_blood_Pressure, cholesterol level, blood sugar level, smoking status, drinking status, and exercise status. Detailed descriptive statistics were compiled for each risk factor to analyze whether it has an impact on cardiovascular disease.
Next, cardiovascular disease prediction models are established to predict whether a respondent has cardiovascular disease. Models are built with K-nearest neighbors, logistic regression, random forest, XGBoost, and LightGBM, along with a fusion (Stacking) model that uses the logistic regression, XGBoost, and LightGBM models as primary learners and the random forest model as the secondary learner. The models are trained with 10-fold cross-validation, their parameters are tuned by random search, and the classification results of each prediction model are obtained. The analysis is implemented in Python. Overall, the accuracy, balanced accuracy, recall, and F1 score of the fusion model are the best among all models. It is therefore concluded that the fusion model has the best classification performance for cardiovascular disease, that the ensemble-algorithm prediction models outperform the traditional algorithms, and that the Stacking fusion model in turn outperforms the individual ensemble-algorithm models.
Keywords/Search Tags: Cardiovascular disease, Ensemble algorithm, Stacking model, Random search