Research On Key Technologies For Stroke Medical Data Mining

Posted on: 2023-08-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Zhang
GTID: 1524306908462334
Subject: Software engineering
Abstract/Summary:
Stroke is an acute cerebrovascular disease with high morbidity, high mortality and a high disability rate. It seriously affects patients' quality of life and places a heavy economic burden on society and families. With the development of medical informatization, data mining and related technologies are widely applied in clinical medicine. Extracting information from medical data, representing it effectively, and building scenario models according to clinical demand are the keys to medical data mining research, which has both research significance and practical value. Stroke medical data mining refers to the scientific representation of medical data through mining and analysis, in order to support medical decisions in stroke diagnosis and treatment. The heterogeneity, temporality, redundancy and polymorphism of medical data make such mining challenging. This thesis concentrates on important clinical scenarios of stroke and studies the key technologies used in medical data mining. The specific research work is as follows:

(1) To address the multi-source nature and data bias of medical data sets, this thesis proposes an adverse event prediction method based on data optimization. Specifically, we preprocess clinical data using medical domain knowledge and then contribute a feature selection method based on set theory. A bias coefficient derived from the multi-angle data distribution weights the final feature set, and an XGBoost model optimized by grid search then performs the target task. Finally, SHapley Additive exPlanations (SHAP) and Partial Dependence Plots visualize the model results to explain their clinical value. Experimental results show that, compared with other methods, this method predicts adverse events of stroke with high accuracy and gives an intuitive interpretation of clinical value.

(2) To handle temporal information mining and correlation representation in medical data, this thesis proposes a mortality risk prediction method based on correlation representation. Specifically, we design a multi-task feature reconstruction module to construct temporally standardized data containing category attributes. A BiLSTM then learns the temporal dependency within the data and is equipped with a correlation attention mechanism to mine the correlation among feature categories. Finally, we interpret and analyze the clinical value of the attention scores with reference to the guidelines for stroke diagnosis and treatment. Experimental results show that this method significantly outperforms competing baselines and verify the interpretability of the correlation attention mechanism.

(3) To cope with multi-dimensional information mining and redundant computation on medical data, this thesis proposes a risk factor assessment method based on multi-dimensional feature fusion. Focusing on four important risk factors, we construct a gated correlation graph convolutional neural network, based on the Pearson matrix of patient baseline data, to learn the complex spatial relationships. We then adopt a BiLSTM to further incorporate the temporal dependence contained in the patient information, and upgrade the attention mechanism to a gated correlation attention mechanism to capture the temporal correlation. The predictive value of different features and different time stamps for mortality risk is verified against medical domain knowledge. Experimental results show that, compared with other baseline methods, this method effectively improves the performance of mortality risk prediction and assesses mortality risk factors more effectively.

(4) To address the high cost of medical data annotation and the problem of causality representation, this thesis proposes an etiological subtyping method based on deep active learning. First, we reconstruct the data sets based on feature importance and provide a multi-angle fusion causal convolutional neural network to represent feature causality within the active learning cycle. Second, with the help of mixed uncertainty, more high-quality data are added in each iteration to reduce the cost of data annotation. To deal with data imbalance, we contribute a KL-Focal Loss that prevents model overfitting in the active learning cycle. Experimental results show that this method simulates the clinical reasoning of doctors, explores a design approach for intelligent computing models of disease, and classifies the various etiologies more accurately than other methods.
Keywords/Search Tags: Medical Data Mining, Data Representation, Correlation, Active Learning, Bias Elimination
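
The uncertainty-driven sample selection described in contribution (4) can be illustrated with a minimal sketch. This is not the thesis's method: the abstract does not define "mixed uncertainty", so the sketch assumes a weighted sum of two standard uncertainty measures (normalized entropy and one minus the top-two margin). The function names, the weight `alpha`, and the toy probability pool are all hypothetical.

```python
import math

def entropy_score(probs):
    """Shannon entropy normalized to [0, 1] by log(number of classes)."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def margin_score(probs):
    """1 - (top1 - top2): close to 1 when the two best classes are nearly tied."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top1 - top2)

def mixed_uncertainty(probs, alpha=0.5):
    """Assumed mixing rule: convex combination of the two scores above."""
    return alpha * entropy_score(probs) + (1 - alpha) * margin_score(probs)

def select_for_annotation(pool_probs, k, alpha=0.5):
    """Return indices of the k unlabeled samples the model is least sure about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: mixed_uncertainty(pool_probs[i], alpha),
        reverse=True,
    )
    return ranked[:k]

# Toy class-probability outputs for four unlabeled samples (made-up numbers).
pool = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # ambiguous
    [0.34, 0.33, 0.33],  # almost uniform
    [0.60, 0.30, 0.10],  # moderately confident
]
picked = select_for_annotation(pool, k=2)
print(picked)
```

In an active learning cycle, the selected samples would be sent to an annotator, added to the labeled set, and the model retrained before the next round of selection.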