
Medical And Environmental Applications Of Deep Learning-based Multi-modal Fusion

Posted on: 2022-03-18  Degree: Doctor  Type: Dissertation
Country: China  Candidate: S L Zhang  Full Text: PDF
GTID: 1484306323462934  Subject: Instrument Science and Technology
Abstract/Summary:
Computer-aided diagnosis and environmental monitoring systems are of great significance to human health. The wide application of multi-sensor technology in computer-aided diagnosis and environmental monitoring provides a good data foundation for improving the performance of these systems through multi-modal learning. While multi-modal data can provide richer input information, the redundancy and noise of that information also make effective feature extraction difficult. Deep learning-based multi-modal fusion performs well in feature extraction and fusion and is widely used in multi-modal data processing. However, in tasks with limited annotated data, especially in the medical and environmental monitoring fields, deep multi-modal fusion learning still faces difficulties such as ineffective feature extraction and over-fitting. To address the problems of fundus arteriovenous segmentation and NO2 detection with very limited labeled data, through careful design of the deep network structure, loss function, and modal features, as well as improvements to feature fusion, we constructed three types of deep multi-modal information fusion networks using task coordination, multi-stage fusion, and attention fusion, respectively. The networks significantly improved the efficiency of feature learning in the two multi-modal applications, reduced information redundancy and over-fitting, and strengthened the expression and fusion of important modality-private features. The research content is as follows:

1) Arteriovenous segmentation of multi-modal fundus images using a triple cascade network based on task collaboration. To improve the feature-extraction ability of the multi-modal retinal image fusion network and obtain the best representation for the target task, we divided the arteriovenous segmentation task into vessel segmentation and vessel-pixel classification, and designed a triple cascade network based on task collaboration according to the relation between vessel segmentation and arteriovenous segmentation (a minimal sketch follows this list). The network consists of three sub-networks corresponding to the three segmentation tasks. Five-fold cross-validation experiments on the DRIVE and dual-modal fundus datasets demonstrated that the network significantly improves arteriovenous segmentation and avoids over-fitting. Besides, to improve performance further, we innovatively used the imaging results at 610 nm and 570 nm, which highlight the appearance difference between arteries and veins. We created the first dual-modal fundus arteriovenous segmentation dataset (DualModal2019) and made it public for others to reproduce and improve upon.

2) Fundus arteriovenous segmentation using a multi-stage fusion network combined with oxygen saturation information. To reduce the information redundancy caused by pixel-level fusion of multi-modal images, we proposed a multi-stage fusion network that provides each sub-network with the most directly related modality data, according to the difference in correlation between each sub-task and the modalities; this alleviated the over-fitting of the fusion network. Considering the semantic correlation between vessel pixels and arteriovenous pixels, we designed a new enhancement loss based on predicted vessels on top of the original arteriovenous segmentation loss. Increasing the weight of the arteriovenous segmentation loss at predicted vessel pixels makes the network focus more on the segmentation of error-prone locations (see the loss sketch after this list). Because oxygen saturation differs much more distinctly between arteries and veins, we innovatively applied it as a new modal feature in the fusion network to further improve arteriovenous segmentation.
3) A fusion network based on an attention mechanism to predict the three-dimensional concentration of nitrogen dioxide over the Beijing-Tianjin-Hebei region. Because different modal data and tasks vary in importance, we designed an attention-based network to fuse the private features of the meteorological field and the chemical field from different modalities. By dynamically generating a set of weight vectors, the model adaptively assigns more weight to the features of important modalities (see the fusion sketch after this list). This strengthened the expression of private features from important modalities, reduced the interference of non-essential modal data, and improved the efficiency of information fusion. We collected vertical profile observations of NO2 from the limited sparse ground-based remote sensing sites in the Jing-Jin-Ji region. The fusion network was trained with meteorological and chemical data from WRF-CHEM simulations as input. We achieved a 3D prediction of NO2 over Beijing and its surrounding areas (115.005-117.905°E and 39.005-41.405°N) with 24-hour coverage.
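A minimal PyTorch-style sketch of the task-collaboration cascade in 1). The layer sizes, sub-network depths, and the way the predicted vessel map conditions the artery and vein branches are illustrative assumptions, not the dissertation's actual architecture.

import torch
import torch.nn as nn

class TripleCascade(nn.Module):
    # Three sub-networks for three tasks: vessel segmentation, then artery
    # and vein classification conditioned on the predicted vessel map.
    def __init__(self, in_ch=3, feat=32):
        super().__init__()
        def block(ci, co):
            return nn.Sequential(
                nn.Conv2d(ci, co, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(co, co, 3, padding=1), nn.ReLU(inplace=True))
        self.vessel_net = nn.Sequential(block(in_ch, feat), nn.Conv2d(feat, 1, 1))
        # Downstream branches see the image plus the vessel probability map.
        self.artery_net = nn.Sequential(block(in_ch + 1, feat), nn.Conv2d(feat, 1, 1))
        self.vein_net = nn.Sequential(block(in_ch + 1, feat), nn.Conv2d(feat, 1, 1))

    def forward(self, x):
        vessel = torch.sigmoid(self.vessel_net(x))   # task 1: vessels
        xv = torch.cat([x, vessel], dim=1)           # cascade the prediction
        artery = torch.sigmoid(self.artery_net(xv))  # task 2: arteries
        vein = torch.sigmoid(self.vein_net(xv))      # task 3: veins
        return vessel, artery, vein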
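A minimal sketch of the vessel-based enhancement loss in 2), assuming binary logits and a weighting of the form 1 + alpha * vessel probability; the exact formulation and the value of alpha are assumptions.

import torch.nn.functional as F

def enhanced_av_loss(av_logits, av_target, vessel_prob, alpha=2.0):
    # Per-pixel arteriovenous loss, up-weighted at predicted vessel pixels.
    base = F.binary_cross_entropy_with_logits(av_logits, av_target,
                                              reduction='none')
    # Heavier weight where the vessel branch predicts vessels, i.e. the
    # error-prone locations; alpha is an illustrative hyper-parameter.
    weight = 1.0 + alpha * vessel_prob.detach()
    return (weight * base).mean()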
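A minimal sketch of the attention fusion in 3): a shared scoring network produces one weight per modality, normalized with a softmax. The feature shapes and the scoring MLP are illustrative assumptions.

import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    # Learns one scalar weight per modality and fuses the weighted features.
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(dim, dim // 2), nn.ReLU(inplace=True),
            nn.Linear(dim // 2, 1))

    def forward(self, feats):
        # feats: (batch, n_modalities, dim) private features per modality
        weights = torch.softmax(self.score(feats), dim=1)  # adaptive weights
        return (weights * feats).sum(dim=1)                # fused (batch, dim)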
Keywords/Search Tags: Multi-modal, Deep learning, Information fusion, Arteriovenous segmentation, Prediction of NO2, Task coordination, Multi-stage fusion, Attention