Font Size: a A A

Research On Image Captioning Models Based On High-Level Semantics

Posted on:2021-01-23Degree:MasterType:Thesis
Country:ChinaCandidate:N X WangFull Text:PDF
GTID:2428330614960439Subject:Computer technology
Abstract/Summary:PDF Full Text Request
The task of image captioning aims to generate natural language description for a given image,which is very challenging since it involvs both computer vision and natural language processing.In recent years,image captioning methods based on deep neural network have made great progress.However,those existing methods still cannot avoid the problem of inaccurate and unnatural caption caused by the lack of high-level semantics.Therefore,we studied methods and techniques of image captioning based on the high-level semantic information in this dissertation.The main works of this dissertation are as follows:(1)Although neural network-based encoder-decoder model can learn the relationships between the encoded image features and the decoded description relying on a large training set,some defects,such as semantic deficiency and semantic error,cannot be avoided completely.To address this problem,this dissertation makes some improvement on the basis of classical encoder-decoder model and designs a new image captioning model that combines high-level semantics regeneration,which detects the high-level semantic words by using of Faster R-CNN,and then integrates the high-level semantics into the neural network through the attention mechanism to regenerate the initial image caption.Experimental results show that the combination of high-level semantic information is helpful to improve image description.(2)Studies show that when people describe an image,language level is not the only dependence,some common sense knowledge,which is not included in the image obviously,is important too.However,those existing image captioning methods seldom use external knowledge fully.Therefore,a image captioning model based on high-level semantic information and external knowledge base is proposed in this dissertation.This model obtains related additional knowledge based on the high-level semantics of the image,and then integrates the obtained additional knowledge into the model through an attention mechanism to generate better image caption.Experimental results verify the effectiveness of the model.
Keywords/Search Tags:Image Captioning, Neural Network, Encoder-Decoder Model, High-Level Semantics, Attention Mechanism
PDF Full Text Request
Related items