
Research On Cross-Modal Transmission For Semantic Communication Networks

Posted on: 2024-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: L L Qin
GTID: 2568307118983559
Subject: Information and Communication Engineering

Abstract/Summary:
Traditional communication currently faces an information bottleneck: channel capacity is approaching the Shannon limit, and source compression efficiency is reaching its upper limit. Semantic communication encodes the semantics of messages, which eliminates data redundancy and reduces transmission overhead, and therefore promises to alleviate this bottleneck. However, most current research prototypes of semantic communication target a single modality, which limits their practicality. In addition, semantic communication systems for image sources typically use low-level or segmented semantic information, while systems that reconstruct images from high-level semantic information remain unexplored. To address these issues, this thesis proposes an Image-to-Text-to-Image (I2T2I) cross-modal semantic communication system for the task of transmitting images. By using the text semantic information of an image, image compression shifts from traditional pixel-level reconstruction to deep-learning-based semantic-level reconstruction. The specific research contents and innovations are as follows:

1. To enhance the cross-modal semantic extraction ability of the I2T2I cross-modal semantic communication system, the Image-to-Text (I2T) cross-modal semantic encoder is investigated first. The encoder employs a bilinear attention mechanism to construct a multi-head attention module and introduces an optimized attention network. This addresses the problem that traditional attention mechanisms focus only on low-level feature interactions of the image, which leads to an incomplete summarization of the image's semantic information. Simulation experiments show that, compared with other models, the proposed I2T cross-modal semantic encoder understands the semantic information of image content more comprehensively across various evaluation metrics.

2. To improve the capability of the I2T2I cross-modal semantic communication system to reconstruct semantic information, the Text-to-Image (T2I) cross-modal semantic decoder is investigated next. The decoder alleviates the competition among multiple generators in current stack-based text-to-image generation models, which causes semantic deviation in the generated images. First, this thesis designs a multi-information fusion module that fully exploits the advantages of each generator to ensure the high quality of the final generated image. Second, it proposes a new text-image similarity model that incorporates position information encoding and fully considers the similarity information of adjacent encodings in the image, in order to minimize the loss of semantic information in the image. Experimental results show that the proposed T2I cross-modal semantic decoder has better semantic understanding ability than the baseline model and can reconstruct images with high fidelity.

3. Current semantic communication prototypes mainly focus on a single modality, and semantic communication systems that take images as sources pay more attention to low-level or segmented semantic information. To address this issue, this thesis proposes an I2T2I cross-modal semantic communication system that uses text semantic information to reconstruct images. First, to reduce the data transmission overhead, only the index of each word in the vocabulary is transmitted when sending the text semantic information of the image. Second, to ensure semantic similarity between the generated and input images, this thesis designs a semantic consistency module that uses a siamese network and a contrastive loss function to measure the semantic consistency between the input and reconstructed images. In addition, transfer learning is applied in the system: pre-trained models of the I2T cross-modal semantic encoder and the T2I cross-modal semantic decoder are used to achieve end-to-end training. Finally, experimental results verify the effectiveness and feasibility of the proposed I2T2I cross-modal semantic communication system.
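
The overhead reduction from transmitting only vocabulary indices can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the toy vocabulary, the example caption, the fixed-width index encoding, and the 64x64 RGB image size. Channel coding and entropy coding are ignored.

```python
import math
from typing import List

# Hypothetical toy vocabulary shared by transmitter and receiver.
# A real captioning vocabulary would contain thousands of words.
VOCAB = ["<pad>", "<start>", "<end>", "a", "bird", "with",
         "red", "wings", "on", "branch"]
WORD_TO_INDEX = {word: i for i, word in enumerate(VOCAB)}


def encode_caption(caption: str) -> List[int]:
    """Map each word of a caption to its index in the shared vocabulary."""
    return [WORD_TO_INDEX[word] for word in caption.lower().split()]


def decode_caption(indices: List[int]) -> str:
    """Recover the caption text from the received indices."""
    return " ".join(VOCAB[i] for i in indices)


def payload_bits(indices: List[int], vocab_size: int) -> int:
    """Bits needed when every index uses a fixed-width binary code (assumption)."""
    bits_per_index = math.ceil(math.log2(vocab_size))
    return len(indices) * bits_per_index


caption = "a bird with red wings on a branch"
indices = encode_caption(caption)
semantic_bits = payload_bits(indices, len(VOCAB))

# Raw size of a hypothetical 64x64 RGB image at 8 bits per channel.
raw_image_bits = 64 * 64 * 3 * 8

print(indices)
print(semantic_bits, raw_image_bits)
```

In this toy setting the receiver needs only the index sequence and the shared vocabulary to recover the caption, which the text-to-image decoder would then use to regenerate the image.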
Keywords/Search Tags: cross-modal semantic communication, semantic consistency, end-to-end communication, transfer learning, deep learning
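
The abstract describes a semantic consistency module built on a siamese network and a contrastive loss, but does not state the exact loss. The sketch below shows one common margin-based form of contrastive loss as an assumption, not as the thesis's formulation. The embeddings are assumed to come from the same shared encoder applied to the input and reconstructed images; the encoder itself is not shown, and the function names and the `margin` default are hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(emb_a: Sequence[float],
                     emb_b: Sequence[float],
                     is_similar: bool,
                     margin: float = 1.0) -> float:
    """Margin-based contrastive loss for one pair of embeddings.

    Similar pairs are penalised by their squared distance, so they are
    pulled together. Dissimilar pairs are penalised only while they are
    closer than `margin`, so they are pushed apart up to the margin.
    """
    distance = euclidean_distance(emb_a, emb_b)
    if is_similar:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


# Hypothetical embeddings of an input image and its reconstruction.
input_embedding = [0.0, 0.0]
reconstructed_embedding = [3.0, 4.0]

print(contrastive_loss(input_embedding, reconstructed_embedding, is_similar=True))
print(contrastive_loss(input_embedding, reconstructed_embedding, is_similar=False))
```

During end-to-end training, a pair made of an input image and its own reconstruction would be labelled as similar, so minimising the loss pushes the reconstruction toward the input in embedding space.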