| Myocardial infarction is a common cardiovascular disease, and doctors must weigh many factors when diagnosing it. Coronary artery imaging gives a direct visual picture of the patient's cardiovascular status and is one of the key indicators for diagnosis. Blood indicators such as troponin and neutrophils are also important references. In recent years, with the development of society, the prevalence of chronic diseases such as hypertension and diabetes has grown rapidly, and clinical research has found that these diseases are closely related to myocardial infarction. Integrating and analyzing all of these patient factors is a heavy workload for doctors. To reduce the burden on medical staff, this paper studies a method that integrates multi-modal data and generates corresponding user profiles and treatment suggestions to assist doctors in diagnosing myocardial infarction. Compared with single-modal models, multi-modal models capture more information and are more stable when they encounter abnormal data. To address the lack of multi-modal myocardial infarction datasets, this paper uses data from several departments, including the Imaging Department and the Cardiovascular Department, of a hospital in Jilin Province to construct a multi-modal dataset with a balanced mix of data types and a sufficient number of samples. To improve data quality and avoid the influence of abnormal data, the constructed dataset is preprocessed. A multi-layer perceptron (Mlp_V1) is then selected as the feature extractor for the numerical modality, and Inception modules and DenseNet ideas are introduced to construct Mlp_V2, which extracts more numerical-modality feature information and improves the overall prediction performance of the model. In addition, this paper proposes a cross-modal feature fusion module (CMFFM) that adaptively assigns weights to the features of different modalities during model training, enhancing feature fusion between modalities. Comparative experiments verify that introducing Inception modules and DenseNet ideas strengthens the feature extraction ability of the numerical-modality model: compared with the basic multi-layer perceptron Mlp_V1, Mlp_V2 improves accuracy by 2.03%. The model that uses CMFFM for cross-modal interaction improves accuracy by 1.66% over the model that uses concatenation-based feature fusion. By replacing the image-modality feature extraction network in the multi-modal model, this paper further shows that the multi-modal model based on Mlp_V2 and CMFFM outperforms the different single-image-modality models. In addition, compared with different single-modal and multi-modal baseline methods under the same conditions, the proposed multi-modal model achieves an accuracy of 87.96%, outperforming the single-modal network models and, to varying degrees, the multi-modal baselines. Finally, to increase the practicality and effectiveness of this study, a simple assisted-diagnosis web application was developed based on the proposed multi-modal feature learning method; it builds user profiles of patients and provides medical staff with corresponding diagnostic suggestions. A case analysis of user profiles demonstrates the sensitivity of the system model to subtle changes in multi-modal data. |