
Research On Heterogeneous Multisource Multimodal Data Fusion Based On Digital Twin

Posted on: 2024-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: C Meng
Full Text: PDF
GTID: 2568307112960459
Subject: Control Science and Engineering
Abstract/Summary:
With the advent of Industry 4.0, digital twin technology has become a key enabling technology. With the rapid development of sensor, information acquisition, information transmission and database technologies, the requirements placed on digital twins for processing heterogeneous multi-source multimodal data continue to grow, bringing problems such as poor quality of extracted information, poor data fusion and poor joint learning results. These problems have become one of the main obstacles limiting the development of digital twin technology at the decision-making level. At present, heterogeneous multi-source multimodal data fusion based on deep learning is an important method in this field and, driven by Industry 4.0, has received considerable attention from scholars. This paper summarizes previous research results and investigates heterogeneous multi-source multimodal data fusion methods based on digital twins. It first proposes feature extraction modules for text and image data, which extract the key information from each; then, to realize heterogeneous multi-source multimodal data fusion, a tensor-decomposition fusion module, HSM-Net, is proposed. The main research contents of this paper are as follows.

(1) In deep-learning-based heterogeneous multi-source multimodal data fusion methods, feature extraction is an essential step, and its quality has a decisive impact on the subsequent fusion results. From the perspective of data feature extraction, this paper investigates extraction methods for text data, combining the BERT pre-trained model, an MLP and a residual network so that the model can discriminate the meaning expressed by the whole text, while the residual connections effectively avoid problems such as vanishing or exploding gradients (a minimal sketch follows this abstract).

(2) To provide high-quality features for the subsequent data fusion, a Vision Transformer-based feature extraction module is proposed for image data, which effectively combines the local contextual information captured by CNNs with the global contextual information captured by the Vision Transformer. The module contains three stages, each consisting of a CNN and a Vision Transformer, which reduces the number of parameters stage by stage and improves computational speed: the CNN processes the low-level features and the Vision Transformer processes the higher-order information, providing better features for the subsequent fusion work (see the hybrid-stage sketch below).

(3) Owing to the heterogeneity of text, contour, segmentation and style data, it is difficult to fuse them effectively with traditional data fusion methods. To make full use of the effective information in these different modalities, a fusion module based on tensor decomposition is proposed: it extracts features from each modality, encodes the feature vectors, maps them to a high-dimensional space, and uses tensor decomposition theory to decompose the high-rank tensor into low-rank tensors before the fusion operation, effectively fusing high-dimensional latent information so that the result contains richer and more effective information (see the low-rank fusion sketch below).
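The following is a minimal sketch of the text feature extractor described in item (1): a BERT encoder followed by MLP blocks with residual (skip) connections. The class names, dimensions and the choice of the [CLS] token as the sentence summary are illustrative assumptions, not the thesis implementation.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer


class ResidualMLPBlock(nn.Module):
    """MLP with a skip connection; the residual path helps avoid vanishing/exploding gradients."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        return self.norm(x + self.mlp(x))


class TextFeatureExtractor(nn.Module):
    """BERT encoder + residual MLP head producing a fixed-size text feature vector."""
    def __init__(self, out_dim: int = 256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        dim = self.bert.config.hidden_size  # 768 for bert-base
        self.blocks = nn.Sequential(ResidualMLPBlock(dim, 2 * dim),
                                    ResidualMLPBlock(dim, 2 * dim))
        self.proj = nn.Linear(dim, out_dim)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]      # [CLS] token summarizes the whole text
        return self.proj(self.blocks(cls))     # fixed-size text feature


# Illustrative usage:
# tok = BertTokenizer.from_pretrained("bert-base-uncased")
# batch = tok(["a sample sentence"], return_tensors="pt", padding=True)
# feats = TextFeatureExtractor()(batch["input_ids"], batch["attention_mask"])
```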
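Next, a sketch of the three-stage CNN + Vision Transformer image extractor from item (2). Channel counts, depths and head numbers are assumptions; the intent is to show how each stage downsamples with a convolution (local features) and then applies self-attention over the resulting tokens (global context), so later transformers see fewer tokens.

```python
import torch
import torch.nn as nn


class HybridStage(nn.Module):
    """Conv downsampling for local features, then transformer layers for global context."""
    def __init__(self, in_ch: int, out_ch: int, depth: int = 1, heads: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.GELU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=out_ch, nhead=heads,
                                           dim_feedforward=2 * out_ch, batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        x = self.conv(x)                       # B, C, H, W (spatially halved)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # B, H*W, C token sequence
        tokens = self.attn(tokens)             # global self-attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class ImageFeatureExtractor(nn.Module):
    """Three hybrid stages followed by pooling into a fixed-size image feature."""
    def __init__(self, out_dim: int = 256):
        super().__init__()
        self.stages = nn.Sequential(
            HybridStage(3, 64),     # each stage halves the spatial resolution,
            HybridStage(64, 128),   # reducing parameters and token count stage by stage
            HybridStage(128, 256),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(256, out_dim))

    def forward(self, img):                    # img: B, 3, H, W
        return self.head(self.stages(img))


# feats = ImageFeatureExtractor()(torch.randn(2, 3, 224, 224))  # -> (2, 256)
```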
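Finally, a sketch of a tensor-decomposition fusion step in the spirit of the HSM-Net module from item (3). The formulation follows the common low-rank multimodal fusion idea (rank-R factors per modality instead of the full high-rank outer-product tensor); the class name, rank and dimensions are assumptions, and the thesis module may differ in detail.

```python
import torch
import torch.nn as nn


class LowRankTensorFusion(nn.Module):
    """Each modality's feature (with an appended 1 for bias) is projected by a rank-R factor;
    the projections are multiplied elementwise and summed over the rank dimension, which is
    equivalent to a low-rank decomposition of the full multilinear (outer-product) fusion."""
    def __init__(self, dims, out_dim: int, rank: int = 4):
        super().__init__()
        self.rank = rank
        self.factors = nn.ParameterList(
            [nn.Parameter(torch.randn(rank, d + 1, out_dim) * 0.1) for d in dims]
        )
        self.weights = nn.Parameter(torch.ones(1, rank))
        self.bias = nn.Parameter(torch.zeros(1, out_dim))

    def forward(self, feats):                  # feats: list of (B, d_m) tensors
        B = feats[0].shape[0]
        fused = None
        for x, f in zip(feats, self.factors):
            x1 = torch.cat([x, x.new_ones(B, 1)], dim=1)      # bias trick: append 1
            proj = torch.einsum("bd,rdo->bro", x1, f)         # B, rank, out_dim
            fused = proj if fused is None else fused * proj   # elementwise fusion
        out = torch.einsum("br,bro->bo", self.weights.expand(B, -1), fused)
        return out + self.bias


# Example: fuse text (256-d), image (256-d) and contour (128-d) features into a 64-d vector.
# fusion = LowRankTensorFusion(dims=[256, 256, 128], out_dim=64)
# z = fusion([torch.randn(2, 256), torch.randn(2, 256), torch.randn(2, 128)])
```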
Keywords/Search Tags:Deep learning, Data fusion, Heterogeneous multi-source multimodal data, Image composition