| Heart failure (HF) is the end stage of various heart diseases. One of its characteristics is a high mortality rate, which is up to 75%, and it seriously threatens human health. Therefore, it is crucial to assess the prognosis of heart failure patients scientifically and effectively, both to assist physicians in treating patients and to positively affect patients' healthy lives. Currently, many problems remain in research on the prognostic assessment of heart failure, such as inadequate utilization of patient information, data imbalance, and insufficient analysis of influencing factors. To address these problems, this thesis constructs a machine-learning prognostic assessment model based on electronic health records (EHR) to rank the important influencing factors of heart failure and to predict mortality within 30 days for heart failure patients. The main work of this thesis is as follows. (1) This thesis collected medical information on 1140 patients from Bethune Hospital in Shanxi that could be used in the study of 30-day mortality in heart failure, and built the dataset. To adequately capture the clinical status of heart failure patients, five categories of information were taken into account to establish the heart failure dataset, named HF: general examination indicators, relevant diseases, medication, hospitalization information, and laboratory examination indicators. (2) For the analysis of prognostic factors in patients with heart failure, an improved algorithm, NMI-Relief, based on neighborhood mutual information and Relief, was proposed for feature selection. First, neighborhood mutual information is added to the Relief algorithm to reduce the redundancy between features, and a minimum-redundancy metric is established to achieve feature ranking. Second, to make the performance of the algorithm more stable, some continuous features are discretized and
vectorized based on the clinical statistical experience of doctors. Then, the algorithm is used for feature ranking and medical analysis on the HF dataset. The results show that the NMI-Relief algorithm can effectively improve the prediction performance of the model, increasing the F1 score by 5.45%. Finally, feature ranking and medical discussion are performed on three subsets, HFrEF, HFmrEF, and HFpEF, respectively. Ranking and analyzing the important factors affecting the survival of heart failure patients can improve mortality prediction and provide a reference for physicians in giving scientific guidance on patient prognosis. (3) For mortality prediction in heart failure patients, an adaptive boosting model, MK-SVM-AdaBoost, based on a multi-kernel support vector machine was constructed. First, a combined kernel function of the polynomial kernel and the sigmoid kernel is used to build the base classifier, named MK-SVM; then the AdaBoost algorithm is used to integrate the MK-SVM classifiers. To solve the problem of data imbalance, this thesis uses a combined sampling technique of SMOTE and Tomek links to process the data. Compared with other machine learning models, the MK-SVM-AdaBoost model achieved better mortality prediction, with mean Acc, mean F1 score, and mean MiA-AUC of 84.69%, 84.1%, and 89.5%, respectively. In addition, the effectiveness and stability of the NMI-Relief algorithm were further verified through feature selection comparison experiments. In summary, this thesis established the heart failure dataset, employed the NMI-Relief algorithm to rank and analyze the important features affecting the prognosis and survival of patients with heart failure, and on this basis constructed the MK-SVM-AdaBoost model to effectively predict mortality within 30 days for heart failure patients, so as to better serve the clinical prognosis of heart failure. |
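
The combined kernel behind MK-SVM can be illustrated with a minimal sketch. The abstract states only that the polynomial and sigmoid kernels are combined; the convex weighting used here, along with the names `combined_kernel`, `weight`, `gamma`, `coef0`, and `degree` and their default values, are assumptions made for illustration and may differ from the thesis's actual formulation.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def dot(x: Vector, y: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x: Vector, y: Vector, gamma: float = 1.0,
                      coef0: float = 1.0, degree: int = 3) -> float:
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * dot(x, y) + coef0) ** degree


def sigmoid_kernel(x: Vector, y: Vector, gamma: float = 1.0,
                   coef0: float = 0.0) -> float:
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * dot(x, y) + coef0)


def combined_kernel(x: Vector, y: Vector, weight: float = 0.5) -> float:
    """Assumed combination: weight * polynomial + (1 - weight) * sigmoid.

    The weighting scheme is hypothetical; the source does not specify it.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * polynomial_kernel(x, y)
            + (1.0 - weight) * sigmoid_kernel(x, y))


def gram_matrix(samples: Sequence[Vector],
                kernel: Callable[[Vector, Vector], float]) -> List[List[float]]:
    """Pairwise kernel values for all samples."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Toy feature vectors, not patient data.
toy_samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_gram = gram_matrix(toy_samples, combined_kernel)
```

A Gram matrix of this kind could be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`, and such a classifier could then serve as the base learner in a boosting loop. Whether the thesis implements it this way is not stated in the abstract.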