Paragraph image captioning aims to generate a descriptive paragraph for a given image. As an important research direction of cross-media intelligence, it connects two significant areas: computer vision and natural language processing. Progress on this task is important for bridging the semantic gap between images and text.

In recent years, thanks to the sequence modeling capability of the RNN (Recurrent Neural Network) family, hierarchical RNN decoders have been widely used in paragraph image captioning. However, the limitations of the RNN structure give such methods the following problems. First, because RNNs have limited ability to capture long-term information, they struggle to generate long paragraphs, and the generated paragraphs lack coherence. In addition, the serial structure of RNNs results in higher training time complexity and lower efficiency.

Inspired by the characteristics of the CNN (Convolutional Neural Network), we carry out the following work. We propose a paragraph decoder based on a fully convolutional network: a gated structure is incorporated into a hierarchical CNN decoder, which provides stronger long-term memory and the ability to train in parallel. We also propose a metric to measure the coherence of paragraphs. We conduct experiments on evaluation metrics, the coherence metric, training time complexity, and subjective analysis on the Stanford image-paragraph dataset, and conclude that our decoder improves the quality of the generated paragraphs.

To further enhance image comprehension, we propose Dual-CNN, a paragraph image captioning model that integrates regional attention, together with a metric for measuring the diversity of sentences in a paragraph. Through experiments on evaluation metrics, the diversity metric, regional attention analysis, and subjective analysis, Dual-CNN significantly improves performance on the paragraph image captioning task.
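The abstract does not detail the gated structure of the CNN decoder. As a minimal sketch, assuming a GLU-style gated convolution (the building block commonly used in gated CNN sequence models, not necessarily the exact design here), a causal gated 1-D convolution can be written as follows; all names are illustrative:

```python
import numpy as np


def gated_conv1d(x, W, V, b, c):
    """GLU-style causal gated 1-D convolution over a sequence.

    x: (T, d_in) input sequence; W, V: (k, d_in, d_out) kernels for the
    linear and gate paths; b, c: (d_out,) biases.
    Returns h = (x * W + b) ⊗ sigmoid(x * V + c), shape (T, d_out).
    """
    k, d_in, d_out = W.shape
    T = x.shape[0]
    # Left-pad with k-1 zero frames so position t only sees inputs <= t.
    xp = np.vstack([np.zeros((k - 1, d_in)), x])
    A = np.zeros((T, d_out))
    B = np.zeros((T, d_out))
    for t in range(T):
        window = xp[t:t + k]  # (k, d_in) receptive field at step t
        A[t] = np.tensordot(window, W, axes=([0, 1], [0, 1])) + b
        B[t] = np.tensordot(window, V, axes=([0, 1], [0, 1])) + c
    # Sigmoid gate controls how much of the linear path passes through,
    # giving the decoder a learnable long-term memory mechanism.
    return A * (1.0 / (1.0 + np.exp(-B)))
```

Because each output position depends only on a fixed window of past inputs, all positions can be computed in parallel during training, unlike the serial recurrence of an RNN.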
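The abstract does not define the proposed diversity metric. As a purely hypothetical sketch of one way such a metric could be formulated (not the paper's actual definition), sentence diversity within a paragraph can be scored as the average pairwise Jaccard distance between sentence word sets:

```python
from itertools import combinations


def sentence_diversity(sentences):
    """Average pairwise Jaccard distance between sentence word sets.

    Hypothetical illustration only: 0.0 means all sentences share the
    same words; 1.0 means every pair of sentences is fully disjoint.
    """
    word_sets = [set(s.lower().split()) for s in sentences]
    if len(word_sets) < 2:
        return 0.0
    distances = []
    for a, b in combinations(word_sets, 2):
        union = a | b
        # Jaccard distance: 1 - |intersection| / |union|
        distances.append(1.0 - len(a & b) / len(union) if union else 0.0)
    return sum(distances) / len(distances)
```

A paragraph that repeats the same sentence scores 0.0, while a paragraph whose sentences share no words scores 1.0, so higher values indicate more varied descriptions.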