Heart failure (HF) is the final stage in the development of various heart diseases. The mortality of patients with HF is highly variable, ranging from 5% to 75%. Predicting the mortality of HF patients therefore helps clinicians formulate more reasonable and scientific treatment plans, prevent further deterioration of the condition, and reduce medical expenses. At present, there are two main types of model for predicting HF mortality: medical models based on medical knowledge and statistics, and machine learning or deep learning models that rely on computer algorithms. These models make insufficient use of patient characteristics, ignore the impact of missing raw data on the model, and suffer from data imbalance. To address these problems, we extract data from the public MIMIC-III database and build machine learning and deep learning models to predict the all-cause mortality of HF patients. The main work of this paper is as follows:

(1) We extract the data of 10311 HF patients usable for this research from the MIMIC-III public database and divide the patients into four categories by time of death: death within 30 days, within 180 days, within 365 days, and after 365 days. We take full account of the patient information and group it into 7 categories: demographic data, related disease information, medication information, surgery information, ICU information, laboratory test information, and general examination information.

(2) We construct a K-nearest neighbor (KNN) model based on a mixed weighted distance to predict the mortality of HF patients within 30 days. First, the chi-square test and L1-regularized logistic regression are used to select and rank the features. Next, a mixed weighted distance is proposed to overcome two weaknesses of KNN: a single distance cannot accurately measure the distance between samples that contain both discrete and continuous variables, and the voting method does not account for the influence of distance on the result. The distance is computed as a mixture of the Value Difference Metric and the Manhattan distance. The softmin function is then used to weight the distances and assign the category of the test sample. Finally, experiments on 2743 heart failure records, drawn by random sampling without repetition, demonstrate the validity of the model.

(3) We construct a convolutional neural network based on the multi-head self-attention mechanism. First, we further process the missing data by adding an indicator vector, which marks whether each feature value is a true value or a filled value; this both handles the missing values and expands the data dimension. Next, convolution kernels of different sizes are used in one layer of the model to capture features from different receptive fields. A multi-head self-attention mechanism is then applied before the fully connected layer to obtain information across the entire channel, which is an important step in improving the overall performance of the model. In addition, the Focal loss function is introduced to better handle data imbalance. Finally, to verify the effectiveness of the model and to address the black-box problem of deep learning, Deep SHAP is used to analyze the deep learning model, yielding the 15 most important features that affect the mortality of HF patients. We analyze these features in combination with clinical medical knowledge.

Comparison with other models, together with the model interpretation, further confirms the validity and rationality of the model, which provides a reference for doctors in formulating more reasonable treatment plans.
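
As an illustration of the mixed weighted distance in (2), the following is a minimal sketch rather than the implementation used in this work. It assumes that the Value Difference Metric (with exponent 1) over the discrete features and the Manhattan distance over the continuous features are simply summed, and that the softmin weights are taken over the k nearest neighbors; the function names, the default k, and the fallback for unseen category values are all hypothetical.

```python
import numpy as np

def fit_vdm_tables(X_cat, y, n_classes):
    """For each discrete column, estimate P(class | value) from the training set."""
    tables = []
    for j in range(X_cat.shape[1]):
        table = {}
        for v in np.unique(X_cat[:, j]):
            counts = np.bincount(y[X_cat[:, j] == v], minlength=n_classes)
            table[int(v)] = counts / counts.sum()
        tables.append(table)
    return tables

def mixed_distance(x_cat, x_num, z_cat, z_num, tables, n_classes):
    """VDM on discrete features plus Manhattan distance on continuous features."""
    # Assumption: values never seen in training fall back to a uniform distribution.
    uniform = np.full(n_classes, 1.0 / n_classes)
    d = 0.0
    for j, table in enumerate(tables):
        p = table.get(int(x_cat[j]), uniform)
        q = table.get(int(z_cat[j]), uniform)
        d += np.abs(p - q).sum()
    return d + np.abs(x_num - z_num).sum()

def predict(x_cat, x_num, X_cat, X_num, y, tables, n_classes, k=5):
    """Classify one sample by softmin-weighted voting among its k nearest neighbors."""
    dists = np.array([
        mixed_distance(x_cat, x_num, X_cat[i], X_num[i], tables, n_classes)
        for i in range(len(y))
    ])
    nearest = np.argsort(dists)[:k]
    d = dists[nearest]
    # Softmin: closer neighbors get larger weights; shifting by d.min() avoids underflow.
    w = np.exp(-(d - d.min()))
    w /= w.sum()
    scores = np.bincount(y[nearest], weights=w, minlength=n_classes)
    return int(np.argmax(scores)), scores
```

Here the discrete features are assumed to be integer-coded and the continuous features scaled to comparable ranges, so that neither part of the distance dominates the other.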
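
Two of the ingredients in (3), the missing-value indicator vector and the Focal loss, can be sketched in the same hedged way. The filling strategy (column means), the binary form of the loss, and the defaults gamma = 2 and alpha = 0.25 are assumptions made for illustration; the abstract does not specify them, and the function names are hypothetical.

```python
import numpy as np

def add_missing_indicator(X):
    """Fill NaNs with column means and append a 0/1 mask marking the filled entries."""
    mask = np.isnan(X)
    col_mean = np.nanmean(X, axis=0)  # assumes every column has at least one observed value
    filled = np.where(mask, col_mean, X)
    # The output has twice as many columns: filled values, then indicators.
    return np.concatenate([filled, mask.astype(X.dtype)], axis=1)

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)              # probability assigned to the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))
```

The factor (1 - p_t)**gamma shrinks the contribution of samples that are already classified well, so the minority class, which tends to be misclassified more often, carries more of the training signal.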