Visual Signal Reconstruction In Cross-modal Communication

Posted on: 2023-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhang
GTID: 2568306836468494
Subject: Signal and Information Processing
Abstract/Summary:
Cross-modal communication is a new communication technology that aims at the coordinated transmission and joint processing of visual, audio, and haptic signals, bringing multi-dimensional integration and a richer immersive sensory experience to multimedia services. However, visual, audio, and haptic signals differ in essential ways, and the communication process is easily disrupted by factors such as limited transmission capacity, so received signals suffer varying degrees of loss. Developing effective cross-modal transmission strategies that guarantee the quality of multimedia services has become one of the major technical challenges of the 5G era. Traditional transmission schemes transmit and process each modal signal separately and cannot meet the requirements of low delay, high reliability, and high throughput at the same time. It is therefore worthwhile to design an effective and general cross-modal transmission strategy that transmits tactile signals completely, reconstructs damaged visual signals from the tactile signals, and thereby indirectly achieves lossless transmission of multimodal signals. Exploiting the latent correlation between the visual and tactile modalities, this thesis proposes a transmission mechanism on the sender side and, on the receiver side, a cross-modal image reconstruction method based on the attention mechanism and cycle consistency, and verifies the effectiveness of the method through extensive experiments. The main work of this thesis covers the following aspects:

(1) A transmission strategy architecture based on the cross-modal communication system is designed and implemented. First, building on the existing preemptive restoration scheduling, a visually assisted haptic content compression method is added: haptic signals with similar visual content are further compressed by exploiting the semantic correlation between signals of different modalities. Second, the cross-modal signal reconstruction methods proposed in Chapters 4 and 5 are used to recover delayed or lost visual signals at the receiver. Finally, the effectiveness of the transmission strategy architecture is verified on the cross-modal communication platform built for this work.

(2) A cross-modal image reconstruction method based on an attention mechanism is proposed to achieve signal recovery. An attention network is built to identify strongly matched pairs across modalities, so that weakly paired data do not mislead the model into learning incorrect information. At the same time, deep semantic mining of the strongly paired modal data further strengthens the association between modalities by constraining both the individual features within each modality and the semantic features shared across modalities. Finally, a comparative experiment on the LMT standard visual and tactile database verifies the effectiveness of the proposed cross-modal image reconstruction method and evaluates the performance of the model.

(3) A cross-modal image reconstruction method based on cycle consistency is proposed to achieve signal recovery. By optimizing the attention network proposed in Chapter 4, the time needed to pair modal data is further reduced and the pairing ability of the network is improved. At the same time, the cycle consistency of the mutual conversion between modal domains is exploited, which helps the trained model generalize and ultimately improves the quality of the generated images. Finally, experiments on the LMT dataset and on the platform designed in Chapter 3 show that the model proposed in this chapter outperforms the model of Chapter 4 in both convergence speed and quality of the reconstructed images.
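
The abstract does not give the loss functions used in the thesis. As a rough illustration of what "cycle consistency of the mutual conversion of different modal domains" usually means, the sketch below computes a generic L1 cycle-consistency loss: a signal mapped into the other modality and back should return close to where it started. It shows the general idea only, not the author's formulation. All names (`cycle_consistency_loss`, `haptic_to_visual`, `visual_to_haptic`) and the toy mappings are hypothetical, and in a real model the two mappings would be trained networks.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    haptic_batch: Sequence[Vector],
    visual_batch: Sequence[Vector],
    haptic_to_visual: Callable[[Vector], Vector],
    visual_to_haptic: Callable[[Vector], Vector],
) -> float:
    """Generic L1 cycle loss over both directions (an assumed, textbook form).

    forward : haptic -> visual -> haptic should reproduce the haptic input
    backward: visual -> haptic -> visual should reproduce the visual input
    """
    forward = sum(
        l1_distance(visual_to_haptic(haptic_to_visual(h)), h)
        for h in haptic_batch
    ) / len(haptic_batch)
    backward = sum(
        l1_distance(haptic_to_visual(visual_to_haptic(v)), v)
        for v in visual_batch
    ) / len(visual_batch)
    return forward + backward


# Toy stand-ins for the two generators (hypothetical, for illustration only).
def double(x: Vector) -> Vector:
    return [2.0 * t for t in x]


def halve(x: Vector) -> Vector:
    return [0.5 * t for t in x]


def identity(x: Vector) -> Vector:
    return list(x)


if __name__ == "__main__":
    haptic = [[1.0, 2.0]]
    visual = [[4.0, 6.0]]
    # Exact inverses: every round trip is perfect, so the loss is 0.0.
    print(cycle_consistency_loss(haptic, visual, double, halve))
    # Mismatched mappings: round trips drift, so the loss is 6.5.
    print(cycle_consistency_loss(haptic, visual, double, identity))
```

In training, a term of this kind is typically added to the reconstruction objective so that the haptic-to-visual generator cannot produce images that discard the information carried by the input signal.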
Keywords/Search Tags: cross-modal transfer, transfer mechanism, signal recovery, correlation, attention mechanism, image generation