Research On Image Captioning Models Based On High-Level Semantics

Posted on:2021-01-23

Degree:Master

Type:Thesis

Country:China

Candidate:N X Wang

Full Text:PDF

GTID:2428330614960439

Subject:Computer technology

Abstract/Summary:

Image captioning aims to generate a natural language description for a given image. The task is challenging because it involves both computer vision and natural language processing. In recent years, image captioning methods based on deep neural networks have made great progress. However, existing methods still produce inaccurate and unnatural captions because they lack high-level semantics. This dissertation therefore studies methods and techniques for image captioning based on high-level semantic information. The main contributions are as follows:

(1) Although a neural network-based encoder-decoder model can learn the relationship between the encoded image features and the decoded description from a large training set, defects such as semantic deficiency and semantic error cannot be avoided completely. To address this problem, this dissertation improves on the classical encoder-decoder model and designs a new image captioning model that incorporates high-level semantics regeneration. The model detects high-level semantic words using Faster R-CNN and then integrates these semantics into the neural network through an attention mechanism to regenerate the initial image caption. Experimental results show that incorporating high-level semantic information helps improve image description.

(2) Studies show that when people describe an image, they do not rely on the language level alone; common-sense knowledge that is not explicitly present in the image is also important. However, existing image captioning methods seldom make full use of external knowledge. Therefore, this dissertation proposes an image captioning model based on high-level semantic information and an external knowledge base. The model retrieves related additional knowledge based on the high-level semantics of the image and then integrates that knowledge into the model through an attention mechanism to generate better image captions. Experimental results verify the effectiveness of the model.
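As a rough illustration of how detected semantic words might be integrated through an attention mechanism, the following is a minimal sketch and not the dissertation's implementation. It assumes plain dot-product attention, and the names `attend`, `semantic_words`, and `hidden_state`, along with the toy three-dimensional embeddings, are hypothetical. In the actual model the words would come from a detector such as Faster R-CNN and the state from the caption decoder.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden, word_vectors):
    """Dot-product attention over semantic word vectors.

    Each word vector is scored by its similarity to the decoder state;
    the context vector is the weighted sum of the word vectors.
    """
    scores = [sum(h * w for h, w in zip(hidden, vec)) for vec in word_vectors]
    weights = softmax(scores)
    context = [
        sum(wt * vec[i] for wt, vec in zip(weights, word_vectors))
        for i in range(len(hidden))
    ]
    return weights, context


# Hypothetical toy embeddings for words a detector might return.
semantic_words = {
    "dog": [1.0, 0.0, 0.0],
    "frisbee": [0.0, 1.0, 0.0],
    "grass": [0.0, 0.0, 1.0],
}
# Hypothetical decoder hidden state at one generation step.
hidden_state = [2.0, 0.5, 0.0]

weights, context = attend(hidden_state, list(semantic_words.values()))
for word, weight in zip(semantic_words, weights):
    print(f"{word}: {weight:.3f}")
```

In this sketch, the word whose embedding is most similar to the decoder state receives the largest weight, so the resulting context vector emphasizes the semantics most relevant to the next generated word. The same pattern could, under the same assumptions, be applied to vectors retrieved from an external knowledge base.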

Keywords/Search Tags:

Image Captioning, Neural Network, Encoder-Decoder Model, High-Level Semantics, Attention Mechanism

Related items

1	Research On Semantic-Attentive Deep Image Captioning Method
2	Research On Image Caption Method Based On High Level Semantic Extraction And Attention Mechanism
3	Image Captioning Based On Adaptive Visual Attention Mechanism
4	Image Captioning Based On Deep Recurrent Convolution Network And Spatio-temporal Information Fusion
5	Image Chinese Captioning Model Based On Deep Learning
6	Research On Image Captioning Based On Self-Attention And Encoder-Decoder
7	Research On Image Captioning Algorithm Based On Attention Mechanism
8	Research On Image Captioning Methods Based On Deep Learning
9	The Research Of Image Captioning Based On Multi-Attention Model And Copy Mechanism
10	Image Captioning Based On Generative Adversarial Network With Temporal Attention