Construction Of A Prediction Model For Heart Disease Based On Analysis Of Physiological And Biochemical Indicators

Posted on: 2024-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: S W Shu
Full Text: PDF
GTID: 2554306935463624
Subject: Mathematical Statistics
Abstract/Summary:
Human life depends on the proper functioning of the heart and blood vessels. Impaired blood circulation can lead to heart failure, kidney failure, impaired brain function and even sudden death. Heart disease can be predicted from various indicators, such as age, gender and cholesterol level. Early diagnosis is essential to minimise heart-related problems and protect against serious risks. Learning the risk factors associated with heart disease and applying machine learning classification models to medical data therefore helps health services and medical professionals identify patients at high risk of developing heart disease. The main work of this paper is to analyse and predict heart disease based on relevant cardiac indicators from an existing dataset.

(1) Data processing and introduction of variables. Data downloaded from the UCI machine learning database were used. The values taken by each variable are described, and variables were selected using a chi-square test combined with Lasso regression. The chi-square test showed that each categorical variable is associated with the dependent variable. After one-hot encoding of the data, the new variables with a zero Lasso regression coefficient were removed to exclude the influence of irrelevant variables on the modelling.

(2) Descriptive statistical analysis of physiological and biochemical indicators. The physiological indicators in the data cover three modules: ECG, heart rate and type of chest pain. The biochemical indicators are blood pressure, cholesterol level and fasting blood glucose concentration. Each indicator is explored in relation to the presence or absence of heart disease, and its specific meaning and effect on the human body are detailed. According to the data, the rate of heart disease increases when physiological or biochemical indicators are abnormal. However, an abnormality in a single indicator is not decisive for heart disease, which is usually judged by combining multiple indicators.

(3) Establishment of classification prediction models. Four classification models, namely logistic regression, support vector machine, LightGBM and random forest, were used to predict whether a person has heart disease, and the Stacking algorithm was used to fuse these four models into a new model. The predictive performance of the models was then evaluated on the accuracy, precision, recall, F1 score and AUC value of each model. For the data in this paper, the best prediction model was selected according to the F1 score (the harmonic mean of precision and recall) and the AUC value, on the premise that accuracy is ensured. The results show that the Stacking fusion model has the highest F1 score, 89.89%, so it is the most effective model for heart disease prediction on the features contained in the data used in this paper.
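The evaluation criteria named in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the function names, the 0.5 decision threshold and the toy labels and scores are assumptions made for illustration only. The sketch computes accuracy, precision, recall and the F1 score (the harmonic mean of precision and recall) from binary predictions, and the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc_score(y_true, scores):
    """AUC via pairwise comparison; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = heart disease, 0 = no heart disease.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
    # Assumed decision threshold of 0.5 on the predicted score.
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(classification_metrics(y_true, y_pred))
    print("AUC:", auc_score(y_true, scores))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a dataset of a few hundred records. A library routine would normally be used on larger data.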
Keywords/Search Tags: Heart disease prediction, Descriptive statistical analysis, Model fusion, Support vector machine, LightGBM, Random forest