Research On Key Technologies Of Automatic Text Generation Based On Images

Posted on:2020-01-21

Degree:Master

Type:Thesis

Country:China

Candidate:L B Mo

Full Text:PDF

GTID:2428330575456459

Subject:Information and Communication Engineering

Abstract/Summary:

In recent years, artificial intelligence has developed rapidly, and research at the intersection of computer vision (CV) and natural language processing (NLP) has attracted growing interest. Most existing work focuses on the image captioning task, which aims to generate a single-sentence description of a single image. This thesis expands both the input and the output, generating paragraph descriptions from image sequences, a task known as visual storytelling. Compared with image captioning, visual storytelling poses a greater challenge: it requires not only understanding each image in the sequence and the context between images, but also ensuring the coherence of the generated paragraph.

First, this thesis explores deep-learning-based visual storytelling algorithms. To address the shortcomings of current visual storytelling work in image stream modeling and text generation, the first Chinese dataset for this task is constructed, and a retrieval architecture, RST-Att, based on multi-modal space mapping is proposed. On the one hand, RST-Att builds a bidirectional long short-term memory (LSTM) network and incorporates an attention mechanism to improve the modeling of image streams in different scenes. On the other hand, it introduces Rhetorical Structure Theory from linguistics to improve the coherence of the generated text. Experiments on both Chinese and English datasets show that RST-Att outperforms the baseline models.

Furthermore, for the same task, this thesis explores a generative method in contrast to the retrieval method and proposes an adversarial learning network, AAL. AAL constructs a reward model to replace the maximum likelihood estimation learning objective, and uses the rewards it generates to optimize the model. In addition, this thesis proposes a new granularity of text generation: generating the paragraph at the level of sense groups to improve the coherence of the generated text. In several comparison experiments, AAL obtains better results than the baseline models on both automatic evaluation metrics and human evaluation.

Finally, the visual storytelling algorithms proposed above are used to develop a travel notes generation system that automatically generates travel notes from photo streams uploaded by users. The system mainly consists of a data acquisition module, a travel notes generation module, and a back-end management and front-end display module.
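As a rough illustration of the retrieval idea summarized in the abstract (attention over an image stream, then matching against text in a shared space), the following is a minimal sketch and not the RST-Att model itself. It assumes image features and candidate-sentence embeddings are already available as plain vectors in one common space, and it omits the bidirectional LSTM and the Rhetorical Structure Theory component. The function names `attention_pool` and `retrieve`, and all data values, are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(image_feats, query):
    """Dot-product attention: weight each image feature by its
    softmax-normalized similarity to `query` and return the weights
    together with the weighted sum (the context vector)."""
    weights = softmax([dot(f, query) for f in image_feats])
    dim = len(image_feats[0])
    context = [sum(w * f[i] for w, f in zip(weights, image_feats))
               for i in range(dim)]
    return weights, context


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def retrieve(context, candidates):
    """Return the candidate sentence whose embedding is closest
    (by cosine similarity) to the context vector."""
    return max(candidates, key=lambda s: cosine(context, candidates[s]))


# Toy data (hypothetical): three 2-D image features and two sentences.
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
candidates = {
    "We spent the day at the beach.": [1.0, 0.2],
    "We hiked up the mountain.": [0.0, 1.0],
}

weights, context = attention_pool(image_feats, query)
best = retrieve(context, candidates)
```

In this toy setup, the images most similar to the query receive the largest attention weights, and the sentence whose embedding lies closest to the pooled context is retrieved. A real system would learn the features, the query, and the shared space jointly.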

Keywords/Search Tags:

visual storytelling, neural network, attention mechanism, adversarial learning, sense group

Related items

1	Research On Deep Visual Domain Adaptation Based On Adversarial Learning
2	Research On Visual Tracking Based On Multi-attention Convolutional Neural Network
3	Research On Object Tracking By Attentive Adversarial Network
4	Research On Group Recommendation Algorithm Based On Attention Mechanism And Graph Neural Network
5	Research On Image Semantic Understanding Based On Attention Mechanism
6	Group Activity Recognition Algorithm Research Based On Attention Mechanism And Deep Learning Network
7	Research On Deep Learning Algorithm For Sequence Data
8	Research Of Knowledge Graph Embedding Adversarial Learning Method Based On Attention Mechanism
9	Adversarial Examples Defense Method Based On Parallel Attention Mechanism
10	Research And Implementation Of Group Recommendation Model Based On Neural Network