Research On Multimodal Learning For Graphic Travelogue Generation

Posted on:2019-07-24

Degree:Master

Type:Thesis

Country:China

Candidate:J Q Fu

Full Text:PDF

GTID:2348330545458470

Subject:Information and Communication Engineering

Abstract/Summary:

China's tourism market has made great progress in recent years. With the development of data sharing, the concept of smart tourism has been proposed, and improving online travel sites with artificial intelligence has become a major concern of the industry. At present, most online travel websites provide users with richly illustrated graphic travelogues. However, these travelogues have several problems. First, some online travel sites host a large number of travel photo albums without accompanying travel notes. Second, many travel notes lack illustrative images. Finally, the images and travelogues for the same scenic spots are often redundant. To address these problems, we surveyed the related work. After comparing multimodal learning tasks such as visual storytelling and image captioning, we propose deep-learning-based multimodal visual storytelling models to solve them.

On the algorithmic side, to address potential problems of traditional visual storytelling models, we propose three improved models. All three use the same convolutional neural network as the input model for image features and the same language model as the input model for text features. Using a Bidirectional Long Short-Term Memory (BLSTM) network as the modal transformation structure, we build a long-term memory visual storytelling model. On this basis, we further introduce an attention mechanism to construct an attention visual storytelling model, and an adversarial training mechanism to construct an adversarial visual storytelling model. Extensive experiments on Chinese and English datasets verify the validity of the three models, with the adversarial visual storytelling model achieving the best results among all models. By introducing the attention and adversarial mechanisms, this thesis constructs a new learning paradigm for the visual storytelling task. The results show that these mechanisms are effective not only in machine translation and image generation, but also have great potential in multimodal visual storytelling.

On the engineering side, this thesis uses the three improved models to build a graphic travelogue generation system oriented toward travelogue writing applications, and makes a series of optimizations to the algorithms and their efficiency. The system is divided into an offline visual storytelling training system and an online travelogue writing system, and it can be used for tasks such as travelogue writing and travelogue filtering.

Keywords/Search Tags:

multimodal learning, visual storytelling, deep learning

Related items

1	Research And Applications Of Image-text Multimodal Correlation Learning
2	Research On Multimodal Machine Translation Method Based On Visual Information
3	Research On Emotion Recognition Method Based On Multimodal Deep Learning
4	Research On Multimodal Sentiment Analysis Method Based On Deep Learning
5	Eye Tracking Technique Based On Deep Multimodal Learning
6	Research On Multimodal Data Processing Algorithm Based On Deep Learning
7	Research And Application Of Multimodal Learning For Heterogeneous Feature Fusion
8	Multi-modal Information Fusion In Visual Question Answering
9	EEG And EOG-based Multimodal Vigilance Estimation Using Deep Learning Method
10	Multimodal Emotion Recognition Algorithm Based On Deep Learning