Research On Cross-Modal Transmission For Semantic Communication Networks

Posted on:2024-08-21

Degree:Master

Type:Thesis

Country:China

Candidate:L L Qin

GTID:2568307118983559

Subject:Information and Communication Engineering

Abstract/Summary:

Traditional communication currently faces an information bottleneck: channel capacity is approaching the Shannon limit, and source compression efficiency is reaching its upper bound. Semantic communication encodes the semantics of messages, which eliminates data redundancy and reduces transmission overhead, and therefore promises to alleviate this bottleneck. However, most semantic communication research prototypes currently target a single modality, which limits their practicality. Moreover, semantic communication systems for image sources typically use low-level or segmented semantic information, while systems that reconstruct images from high-level semantic information remain unexplored. To address these issues, this thesis proposes an Image-to-Text-to-Image (I2T2I) cross-modal semantic communication system whose task is to transmit images. By using the text semantic information of an image, image compression shifts from traditional pixel-level reconstruction to semantic-level reconstruction based on deep learning. The specific research contents and innovations are as follows:

1. To enhance the cross-modal semantic extraction ability of the I2T2I cross-modal semantic communication system, the Image-to-Text (I2T) cross-modal semantic encoder is investigated first. The encoder employs a bilinear attention mechanism to construct a multi-head attention module and introduces an optimized attention network. This addresses the problem that traditional attention mechanisms focus only on low-level feature interactions within the image, which leads to an incomplete summary of the image's semantic information. Simulation experiments show that, compared with other models, the proposed I2T cross-modal semantic encoder captures the semantic information of image content more comprehensively across various evaluation metrics.

2. To improve the semantic reconstruction capability of the I2T2I cross-modal semantic communication system, the Text-to-Image (T2I) cross-modal semantic decoder is investigated. The decoder alleviates the competition among multiple generators in current stacked text-to-image generation models, which causes semantic deviation in the generated images. First, this thesis designs a multi-information fusion module that exploits the advantages of each generator to ensure the quality of the final generated image. Second, it proposes a new text-image similarity model that incorporates positional encoding and takes into account the similarity of adjacent encodings in the image, in order to minimize the loss of semantic information. Experimental results show that the proposed T2I cross-modal semantic decoder has better semantic understanding ability than the baseline model and can reconstruct images with high fidelity.

3. Current semantic communication prototypes mainly focus on a single modality, and semantic communication systems that take images as sources concentrate on low-level or segmented semantic information. To address this issue, this thesis proposes an I2T2I cross-modal semantic communication system that uses text semantic information to reconstruct images. First, to reduce transmission overhead, only the vocabulary indices of the words are transmitted when sending the text semantic information of the image. Second, to ensure semantic similarity between the generated and input images, this thesis designs a semantic consistency module that uses a Siamese network and a contrastive loss function to measure the semantic consistency between the input and reconstructed images. In addition, transfer learning is applied by using pre-trained models of the I2T cross-modal semantic encoder and the T2I cross-modal semantic decoder to achieve end-to-end training. Finally, experimental results verify the effectiveness and feasibility of the proposed I2T2I cross-modal semantic communication system.
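
The idea of transmitting only vocabulary indices can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy vocabulary, the example caption, the fixed-length index coding, and the 256x256 RGB image size are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Hypothetical toy vocabulary shared by transmitter and receiver (assumption).
VOCAB = ["<pad>", "<unk>", "a", "bird", "with", "red", "wings", "on", "branch", "the"]
WORD_TO_INDEX = {word: i for i, word in enumerate(VOCAB)}


def encode_caption(caption):
    """Map each word of a caption to its vocabulary index (unknown words -> <unk>)."""
    unk = WORD_TO_INDEX["<unk>"]
    return [WORD_TO_INDEX.get(word, unk) for word in caption.lower().split()]


def decode_indices(indices):
    """Recover the caption text at the receiver from the received indices."""
    return " ".join(VOCAB[i] for i in indices)


def payload_bits(indices, vocab_size):
    """Bits needed when every index is sent with a fixed-length binary code."""
    bits_per_word = max(1, math.ceil(math.log2(vocab_size)))
    return len(indices) * bits_per_word


# Assumed caption standing in for the output of an image-to-text encoder.
caption = "a bird with red wings on the branch"
indices = encode_caption(caption)
bits = payload_bits(indices, len(VOCAB))

# Assumed uncompressed source image: 256 x 256 pixels, 3 channels, 8 bits each.
raw_image_bits = 256 * 256 * 3 * 8

print(indices, bits, raw_image_bits)
```

With a realistic vocabulary of tens of thousands of words, each index would need more bits (for example 14 bits for 10,000 words), but a short caption would still be several orders of magnitude smaller than the raw pixels. The receiver would then pass the decoded text to a text-to-image decoder, which this sketch does not model.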

Keywords/Search Tags:

cross-modal semantic communication, semantic consistency, end-to-end communication, transfer learning, deep learning

Related items

1	Semantic Transfer Hashing Based On Deep Learning For Cross-modal Retrieval
2	Research On Cross-Modal Retrieval Of Image And Text Based On Deep Learning
3	Research On Cross-modal Retrieval For Semantic Consistency Learning
4	Research On Cross-Modal Retrieval Based On Deep Semantic Analysis
5	Research Of Multi-label Cross-modal Semantic Hashing Image-text Retrieval
6	Research On Controlled Semantic Embedding And Deep Mutual Information For Cross-modal Hashing
7	Cross-modal Event Retrieval Based On Deep Semantic Learning
8	Research On Semantic Consistency And Matrix Factorization Based Cross-Modal Hashing Retrieval
9	Research On Deep Cross-modal Retrieval Algorithm Based On Representation Learning
10	Cross-modal Retrieval Based On Deep Learning