| With social development, people’s quality of life has improved significantly, yet the rising incidence of cardiovascular disease has become the greatest threat to people’s health, and among cardiovascular diseases acute myocardial infarction has the most serious impact. Traditional diagnosis of acute myocardial infarction is constrained by the individual physician and by the available level of medical care, which makes it prone to misdiagnosis or missed diagnosis and places a heavy burden on patients and medical personnel. This thesis uses machine learning algorithms to build a diagnostic prediction model for acute myocardial infarction, aiming to improve the level of diagnosis and to provide medical workers with an auxiliary diagnostic tool. Firstly, the original data are preprocessed. Samples with many missing values and variables with a high missing rate provide limited effective information and are deleted directly. At the same time, the mode or the K-nearest-neighbor method is used to correct outliers in the data. The number of patients in the sample is far smaller than the number of non-patients, so there is a class imbalance problem that significantly affects the generalization ability of the model. This thesis tries three imbalance processing methods, applying the ADASYN, SMOTE, and SMOTEENN algorithms to rebalance the original data; the resulting model AUC values are 0.760, 0.728, and 0.773, respectively. The results show that imbalance processing can better handle the differences in influence between the sample classes and can provide better diagnostic and predictive performance for acute myocardial infarction. Secondly, feature selection is performed on the samples. The original data contain many feature attributes, many of which not only contribute nothing to the model’s prediction but also increase the computational complexity of the algorithm, and may even interfere with the model’s
performance. Therefore, this thesis uses filter, wrapper, and embedded methods for feature selection, uses the F1 value as the evaluation criterion, calculates the correlation between features, removes highly correlated features, and ultimately selects 27 features. Then, this thesis uses Naive Bayes, Random Forest, Support Vector Machine, and Neural Network to build single prediction models. Comparing the evaluation metrics of the models, the Neural Network model achieves an F1 value of 0.827 and an AUC value of 0.850, the best prediction performance among the single models. To further improve the generalization ability of the model, three ensemble models, AdaBoost, LightGBM, and XGBoost, are established. Comparing the evaluation metrics, the ensemble models score higher than the single models. Finally, the Neural Network model, which has the best prediction performance among the single models, is combined with the three ensemble models to build a Stacking composite learner, and the robustness of the fused model is tested. Within this prediction framework, the model is intended to assist medical workers in diagnosing acute myocardial infarction so that confirmed patients can be identified as early as possible, which is crucial for the treatment and prognosis of acute myocardial infarction. |
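
The class imbalance step described in the abstract can be illustrated with a minimal sketch of SMOTE-style oversampling, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. This is not the thesis's implementation: the function name `smote_oversample`, its parameters, and the toy data are assumptions made purely for illustration, and only the Python standard library is used.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: 3 "patients" against 9 "non-patients".
patients = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
non_patients = [[float(x), float(x % 3)] for x in range(9)]

# Generate enough synthetic patients to match the majority class size.
new_points = smote_oversample(
    patients, n_new=len(non_patients) - len(patients), k=2
)
balanced_patients = patients + new_points
print(len(balanced_patients), len(non_patients))
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the data used to compute metrics such as AUC or F1.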