Research On Multimodal Emoticon Image Synthesis Based On Knowledge Metamodel

Posted on:2023-07-14

Degree:Master

Type:Thesis

Country:China

Candidate:X R Li

GTID:2558306620454664

Subject:Software engineering

Abstract/Summary:

Emoticon image synthesis for social chat is a typical application of cross-modal text-to-image generation: given a user's text description, the system generates a chat emoticon image that matches its semantics. Traditionally, chat emoticon images are obtained by keyword retrieval or matching against a predefined emoticon library. Such libraries are fixed and limited, so they cannot meet the needs of social users. In recent years, text-to-image methods based on Generative Adversarial Networks have been able to generate high-quality images that conform to the semantics of the input text, which offers a feasible technical direction for generating chat emoticon images.

However, traditional text-to-image methods require a large number of training samples, while emoticon images have no dedicated dataset and the data available for each type of emoticon is scarce. Current text-to-image methods can synthesize images with high texture quality, but the semantic consistency of the synthesized images is not ideal, whereas chat emoticon synthesis places more weight on semantic consistency. In addition, traditional methods perform well when synthesizing single-object images but still struggle with complex scenes and multi-object images. The major challenges addressed in this thesis are therefore small datasets, high semantic consistency requirements, and multi-object image synthesis. In response, this thesis studies cross-modal emoticon image synthesis based on a knowledge meta-model. The main contributions are as follows:

(1) A dedicated dataset of chat emoticon images is proposed. At present, no chat emoticon image dataset exists. This thesis collects and organizes the emoticon images commonly used by social users and divides them into three versions, each with corresponding text descriptions. The dataset has been made public, providing data support for subsequent research on emoticon image synthesis.

(2) A cross-modal emoticon image synthesis method based on a knowledge meta-model is proposed. The method has four parts. First, the intrinsic semantic relations of a chat emoticon image are established through the emoticon knowledge meta-model. Second, the multi-generator produces the corresponding partial image for each knowledge element. Third, the multi-generator joint model generates a complete emoticon image from the semantic information and the partial images. Finally, a text semantic alignment module is added to improve the semantic similarity between the generated images and the input text.

(3) A prototype application for emoticon synthesis is developed and made public. Built on the knowledge meta-model synthesis method proposed in this thesis, the prototype consists of four modules: emoticon generation from free input, emoticon generation from prompt input, personalized emoticons, and emoticon recommendation.
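The four-part method in contribution (2) can be pictured with a minimal Python sketch. This is not the thesis's implementation: the element vocabulary, the names `KnowledgeElement`, `parse_description`, `synthesize` and `alignment_score`, and the use of one stub generator per element kind are all assumptions made for illustration. Text placeholders stand in for the GAN generators, and a token-overlap ratio stands in for the text semantic alignment module.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical knowledge meta-model: element kinds and the values each may take.
# The thesis does not list its actual element types in this abstract.
ELEMENT_VOCAB: Dict[str, List[str]] = {
    "face": ["smiling", "crying", "angry"],
    "gesture": ["waving", "thumbs-up"],
    "accessory": ["hat", "glasses"],
}


@dataclass(frozen=True)
class KnowledgeElement:
    kind: str   # e.g. "face"
    value: str  # e.g. "smiling"


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens (hyphens kept)."""
    return re.findall(r"[a-z-]+", text.lower())


def parse_description(text: str) -> List[KnowledgeElement]:
    """Step 1: map a description onto knowledge elements by keyword match."""
    tokens = set(tokenize(text))
    return [
        KnowledgeElement(kind, value)
        for kind, values in ELEMENT_VOCAB.items()
        for value in values
        if value in tokens
    ]


def make_stub_generator(kind: str) -> Callable[[KnowledgeElement], str]:
    """Step 2: a placeholder for a trained per-element image generator."""
    def generate(element: KnowledgeElement) -> str:
        return f"<{kind}:{element.value}>"
    return generate


# Assumption: one generator per element kind.
GENERATORS = {kind: make_stub_generator(kind) for kind in ELEMENT_VOCAB}


def synthesize(text: str) -> Dict[str, object]:
    """Step 3: generate partial 'images' and join them into one result."""
    elements = parse_description(text)
    parts = [GENERATORS[e.kind](e) for e in elements]
    return {"elements": elements, "parts": parts, "image": " + ".join(parts)}


def alignment_score(text: str, elements: List[KnowledgeElement]) -> float:
    """Step 4 stand-in: share of description tokens covered by the elements."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    covered = {e.value for e in elements}
    return sum(1 for t in tokens if t in covered) / len(tokens)


if __name__ == "__main__":
    description = "smiling waving hat"
    result = synthesize(description)
    print(result["image"])
    print(alignment_score(description, result["elements"]))
```

In a real system each stub would be replaced by a trained generator that outputs image tensors, and the alignment step would compare text and image embeddings rather than count tokens.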

Keywords/Search Tags:

text-to-image, emoticon image generation, knowledge meta-model, cross-modal learning

Related items

1	Research On Cross-Modal Image-Text Retrieval Techniques Based On Semantics And Common Sense
2	Research On Cross-Modal Natural Language Generation
3	Research On Cross-modal Generation From Text To Person Image
4	Research On Cross Modal Image And Text Retrieval Methods Based On Pretraining Model
5	Research On Content Sifting And Storage Mechanism Of Cross-modal Image And Text Data Based On Semantic Similarity
6	Image-text Translation Based On Cross-modal Related Semantics And Attention Mechanism
7	Cross-Modal Retrieval Of Image-Text Based On Deep Learning
8	Research And Design Of Text-to-image Synthesis System Based On Cross-modal Correlation
9	Researches On Cross-Modal Learning Algorithms For Image-Text Retrieval
10	Research On Text Guided Image Generation Method Based On Adversarial Learning