Research On The Conversion Of Image-text Algorithm Based On Deep Learning

Posted on: 2021-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Zhang
GTID: 2428330602973058
Subject: Software engineering
Abstract/Summary:
With the rapid development of deep learning, research on images and text, two forms of data closely tied to everyday life, has grown steadily. In recent years the two lines of research have begun to merge: rather than studying images or text in isolation, researchers treat them within a single field. There are two related research areas: image to text (I2T) and text to image (T2I). In I2T, the multimodal recurrent neural network (m-RNN) was first proposed in 2014 and achieved a breakthrough by combining recurrent neural networks (RNN) with convolutional neural networks (CNN); this structure also established the basic framework of the I2T field. In T2I, a deep convolutional generative adversarial network (DCGAN) combined with an RNN and a CNN, proposed in 2016, achieved end-to-end text-to-image synthesis for the first time. Although subsequent research in both fields has made great progress, further improving the quality of the results remains a great challenge. This thesis studies algorithms for conversion between images and text, covering both image to text and text to image. Its main goal is to further improve the performance of I2T and T2I models and thereby advance image-text conversion. It pursues this goal through two schemes. The first is an independent improvement learning scheme, in which each task is improved within its own field. In I2T, this scheme achieves better results on dense captioning, surpassing the state-of-the-art methods; in T2I, it further improves the clarity and authenticity of the synthesized images. The second is a unilateral dual learning scheme, in which the I2T and T2I models are placed in a single structure and trained with a unilateral dual learning method to improve the performance of both. Experiments demonstrate that this scheme is feasible and effective, and better results are achieved in both fields. Together, the two schemes realize conversion between image and text in either direction, and all experimental results show their effectiveness in producing higher-quality conversion results. Overall, this research not only advances the study of I2T and T2I but also lays a solid foundation for future applications of I2T and T2I models.
Keywords/Search Tags: Image to text, Text to image, Deep learning, Generative adversarial networks