
Machine Learning-based Cardiovascular Disease Prediction Study

Posted on: 2024-09-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Zhang
Full Text: PDF
GTID: 2544307187958049
Subject: Computer technology
Abstract/Summary:
Hemodialysis is an effective blood-purification therapy for patients with acute or chronic renal failure. Kidney failure can cause a range of complications that seriously affect patients' lives and well-being, and cardiovascular disease is the most common of them. Timely screening for and prediction of cardiovascular disease in hemodialysis patients, together with early detection and intervention, are therefore particularly important for improving survival and prognosis. This dissertation proposes a machine-learning-based preprocessing method for missing values in medical data and verifies its effectiveness under different feature selection methods. After preprocessing and feature selection, a machine-learning-based cardiovascular disease prediction model is proposed. The main research content is as follows:

(1) An improved missing-value imputation method that combines hot-deck imputation with missForest (HDI-MF) is proposed to address missing data in medical records. Its effectiveness is verified on both simulated and real data. The experiments show that HDI-MF improves the quality of missing-value imputation on medical data.

(2) To address the large number of irrelevant variables and the high dimensionality of medical data, several feature selection methods are used to screen for factors related to cardiovascular disease. The optimal feature subset for prediction is chosen on the basis of cross-validation results, and Kaplan-Meier (KM) analysis is used to test the association between the selected features and cardiovascular disease. The results show that the wrapper feature selection method yields the optimal feature subset.

(3) Single models and ensemble models based on machine learning are each used to analyze the data of cardiovascular disease patients. Grid search and cross-validation are used to optimize the model parameters, and the model with the best predictive performance is determined from the experimental results. Because non-linear models are hard to interpret and therefore difficult to apply, the SHapley Additive exPlanations (SHAP) method is used to visualize the effect of each covariate on the prediction. A nomogram is then constructed to represent the relationship between the variables and the incidence of cardiovascular disease. Finally, a cardiovascular risk assessment tool for hemodialysis patients based on a non-linear ensemble model is proposed, which can assist doctors in clinical diagnosis and treatment.
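As an illustration of the hot-deck idea mentioned in (1), the following is a minimal sketch and not the thesis's HDI-MF method, whose details are not given in the abstract. It assumes a nearest-neighbour donor rule: each incomplete record borrows its missing values from the most similar complete record. The function name `hot_deck_impute`, the three features, and the sample records are all hypothetical.

```python
import math


def hot_deck_impute(rows):
    """Fill None entries by copying values from the nearest complete row (the donor)."""
    donors = [r for r in rows if None not in r]
    if not donors:
        raise ValueError("hot-deck imputation needs at least one complete row")

    def distance(row, donor):
        # Compare only on the features the incomplete row actually has.
        return math.sqrt(
            sum((v - d) ** 2 for v, d in zip(row, donor) if v is not None)
        )

    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        donor = min(donors, key=lambda d: distance(row, d))
        # Keep observed values; take the donor's value where one is missing.
        filled.append([d if v is None else v for v, d in zip(row, donor)])
    return filled


# Hypothetical records: [age, systolic blood pressure, serum albumin]
records = [
    [60, 140, 3.8],
    [45, 120, 4.2],
    [62, 145, None],  # albumin missing
    [44, None, 4.1],  # blood pressure missing
]
completed = hot_deck_impute(records)
```

The distance here is computed on raw values for brevity; with real clinical data the features would normally be standardized first so that no single unit dominates the choice of donor.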
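The Kaplan-Meier analysis referred to in (2) rests on the product-limit estimator, S(t) = prod over event times t_i <= t of (1 - d_i / n_i), where d_i is the number of events at t_i and n_i the number of patients still at risk. The sketch below computes that curve for a single group. The function name `kaplan_meier` and the follow-up data are hypothetical, and the group comparison the thesis would need (for example one curve per feature stratum) is not shown.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed at that time, 0 if the patient was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censorings at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients censored at t still count as at risk at t, then drop out.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months and event indicators
times = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

A feature that separates patients into groups with clearly different curves is a candidate risk factor; a formal comparison would typically add a log-rank test.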
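The SHAP method used in (3) attributes a prediction to individual covariates through Shapley values. The sketch below computes exact Shapley values for a toy three-feature risk score by enumerating every feature subset. It assumes a single reference patient as the baseline, whereas SHAP implementations generally average over a background dataset and use approximations for larger models. The function `shapley_values`, the `risk` score, and all numbers are hypothetical and are not taken from the thesis.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's value; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def risk(features):
    """Hypothetical non-linear risk score over [age, systolic BP, diabetes flag]."""
    age, sbp, diabetes = features
    return 0.01 * age + 0.002 * sbp + 0.2 * diabetes + 0.001 * age * diabetes


patient = [70, 160, 1]
reference = [50, 120, 0]
phi = shapley_values(risk, patient, reference)
```

The attributions sum to the difference between the patient's score and the reference score, which is the property that makes per-covariate plots of these values interpretable. Exact enumeration costs on the order of 2^n model evaluations, so it is only practical for a handful of features.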
Keywords/Search Tags: Machine learning, Disease prediction model, Missing value imputation, Feature selection