
Feature Fusion Mechanism And Applications Of Deep Neural Networks

Posted on: 2022-10-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Wu
Full Text: PDF
GTID: 1488306734471844
Subject: Computer Science and Technology
Abstract/Summary:
As one of the most important tools of artificial intelligence, deep neural networks (DNNs) have achieved breakthroughs in speech recognition, natural language processing, image recognition and many other fields, setting off a new wave of research and applications. Features play an essential role, because how well the feature representations in a network generalize greatly affects the performance of DNNs. This robust feature generalization ability is one of the most important reasons for the wide use of DNNs. A feature fusion mechanism is designed to reuse features and optimize how they are combined within a network, with the aim of further strengthening the feature extraction ability of DNNs.

This dissertation focuses on the feature fusion mechanism of DNNs. Firstly, several single-input feature fusion models are proposed, including a frequency-based fusion model for high-dimensional speech data and an attention-based multi-scale feature fusion model for lesions of different sizes in medical images. Next, for multi-input models, an architecture of independent sub-networks with weight sharing is proposed for same-modality multi-input data, and a multiple cross-connected feature fusion model is proposed for multi-modality data. Finally, based on these models, several intelligent medical assisted-diagnosis systems are developed and deployed in West China Hospital of Sichuan University and Dazhou Central Hospital to help doctors work more efficiently. The main contributions of this dissertation are as follows:

1. The single-input model for audio classification is studied. A local feature expression and a feature fusion method based on the speech frequency domain are proposed to address the difficulty of learning high-dimensional speech data from a limited dataset. An attention mechanism is further introduced to combine the different features, which effectively improves the performance of DNNs in several audio classification tasks.

For spectrogram data of speech, this dissertation proposes a global and local feature fusion module based on the characteristics of the time domain and the frequency domain. Further, a general spectrogram-based feature fusion model is constructed, which can easily be transferred to multiple audio classification tasks. In addition, this dissertation applies attention methods to adjust the feature fusion module by computing the weights between the global and local features, which effectively improves accuracy in audio classification tasks. The experimental results demonstrate that the proposed frequency-domain feature fusion model achieves better results on three public datasets: UT-Podcast, CHAINS and eNTERFACE.

2. The single-input model for automatic recognition of digestive diseases is studied. To address the problem of multi-scale lesions in medical images, a multi-scale feature fusion model based on an attention mechanism is proposed, which improves the accuracy and sensitivity of automatic recognition of digestive diseases on a multi-center dataset. An intelligent aided-diagnosis system is developed to automatically detect diseases of the upper and lower digestive tract.

Intelligent recognition of digestive diseases refers to the application of computer-aided technology to automatically distinguish normal tissue, polyps, erosion, ulcers and other findings. In this dissertation, a multi-center, multi-disease dataset of digestive medical images is constructed. A three-stage recognition model is proposed, consisting of a backbone feature extraction network, a feature fusion module that combines multi-scale features with an attention mechanism, and feature classification. The backbone of the proposed model is first initialized with pre-trained weights, and the model then learns the feature representation through deep learning. Experiments demonstrate that the proposed feature fusion model based on single-input digestive endoscope images is superior to multiple comparison methods, and the visualization results are analyzed. Because digestive endoscopy examination is time-consuming and labor-intensive, this dissertation has developed an artificial-intelligence-aided system based on the proposed model, which can assist doctors in diagnosing digestive diseases in real time. The system has been deployed in West China Hospital of Sichuan University and Dazhou Central Hospital to improve the quality and efficiency of the working process.

3. The multi-input model for automatic recognition of ultrasound images is studied. To address the problem of low contrast in ultrasound images, a two-input feature fusion model is proposed, which reduces the influence of lesion detection results on the subsequent recognition results in the two-stage method and improves the sensitivity and specificity of the deep neural network model in the ultrasound image recognition task.

Most existing work uses a two-stage recognition method: it first locates the region of interest to narrow the recognition range and remove redundant information, and then identifies the target region. This dissertation improves the two-stage recognition method in two respects. Firstly, in the first-stage target detection task, a weighted ensemble method is proposed to improve detection performance. Secondly, in the second-stage recognition task, in contrast to the traditional single-input model, this dissertation proposes a two-input feature fusion framework based on the original image and the region of interest, which consists of three modules: independent sub-networks for feature extraction, a multi-input feature fusion module and a weight-sharing module. To our knowledge, the multi-label renal ultrasound image dataset constructed in this dissertation is the largest of its kind. The experiments explore the influence of the first-stage detection results on the single-input model and on the multi-input feature fusion model. The results confirm that the multi-input feature fusion model is better than the single-input recognition model.

4. The multi-input model for multimodal data classification is studied. Several cross-connected feature fusion modules are proposed, and a multi-interactive feature fusion model is constructed. The proposed methods effectively improve recognition performance on multiple multi-modal tasks. An intelligent assisted-diagnosis system based on multimodal data of coronary heart disease is developed.

Common deep neural networks usually use late fusion to improve performance. However, when the dataset is limited, the large number of parameters in a late fusion model can cause over-fitting. This dissertation proposes several cross-connected feature fusion modules and constructs a multi-interactive feature fusion model by stacking fusion modules and late fusion operators. The proposed model is applied to IEMOCAP and to the multi-modal coronary heart disease dataset constructed in this dissertation. The experimental results show that the multi-interactive feature fusion model is better than the late fusion model. We further cooperated with doctors to develop an intelligent assisted-diagnosis system for coronary heart disease. The system automatically predicts the probability of major coronary heart disease events. At present, the system is in clinical trial use in the Department of Cardiology of West China Hospital of Sichuan University.
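
The attention-weighted fusion of global and local features mentioned under contribution 1 can be illustrated with a small sketch. This is not the dissertation's implementation: the function name `attention_fuse`, the `query` vector, and the use of scaled dot-product scores followed by a softmax are all assumptions chosen for illustration, and the abstract does not specify how the weights are actually computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(global_feat, local_feat, query):
    """Fuse two equal-length feature vectors with attention weights.

    Hypothetical sketch: each feature vector is scored against `query`
    (standing in for a learned parameter) with a scaled dot product, the
    two scores are normalized with a softmax, and the fused feature is
    the weighted sum of the two inputs.
    """
    if not (len(global_feat) == len(local_feat) == len(query)):
        raise ValueError("all vectors must have the same length")
    scale = math.sqrt(len(query))
    scores = [
        sum(q * f for q, f in zip(query, feat)) / scale
        for feat in (global_feat, local_feat)
    ]
    w_global, w_local = softmax(scores)
    fused = [w_global * g + w_local * l for g, l in zip(global_feat, local_feat)]
    return fused, (w_global, w_local)


if __name__ == "__main__":
    fused, weights = attention_fuse([1.0, 0.0], [0.0, 1.0], [2.0, 0.0])
    print(fused, weights)
```

In a trained network the query (or an equivalent scoring layer) would be learned jointly with the feature extractors, so the weights adapt to each input instead of being fixed.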
Keywords/Search Tags: deep neural networks, feature fusion, single-input audio classification, medical image recognition, intelligent medical assisted-diagnosis applications