
Knowledge Scene Graph And Topic Correlation Graph For Image Captioning

Posted on: 2024-08-16
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Wu
GTID: 2568307103974939
Subject: Computer technology
Abstract/Summary:
Image caption generation is a crucial task in computer vision. Scene graphs are commonly used to describe the attributes and relationships of objects in images, and caption generation methods based on scene graphs can apply the semantic information they contain to the conversion from image to natural language. However, traditional scene graph generation methods have limited effectiveness because they do not incorporate prior knowledge and ignore the potential relationships between objects in real life. In addition, current image captioning methods do not make full use of the topic information conveyed by the image during caption generation, which makes it difficult to highlight the topic. This thesis addresses these issues through the following research work:

(1) To address the inadequate representation of complex object relationships in existing image captioning models, this thesis presents a novel method based on a knowledge scene graph. First, the open-domain knowledge extraction tool Open IE is used to construct a knowledge graph that provides semantic connections between objects. Second, the knowledge graph is embedded into the scene graph generation process, allowing the model to learn more complex relationship expressions between objects; a sorting algorithm is used to optimize the embedding process, and a new adaptive attention mechanism is introduced to reduce the influence of non-visual words on the model by adjusting the feature fusion ratio. Finally, the knowledge graph and scene graph are encoded by an encoder-decoder network model, and the image caption is generated by the decoder. Experiments on multiple datasets demonstrate that this method captions images more accurately, particularly when handling complex relationships between objects.

(2) To solve the problem that captions generated by existing models struggle to highlight the image topic, this thesis proposes an image caption generation method based on a topic correlation graph. First, the LDA topic model is used to learn image topic information from the dataset captions. Second, a topic correlation graph is constructed, which displays the information clearly and provides guidance for the generation of the scene graph; the similarity between topic words is calculated using the Wasserstein distance and used as the edge weight. Third, the topic correlation graph features are fused during scene graph generation, resulting in a scene graph that highlights the image topic. Finally, the network model encodes and decodes the scene graph, generating captions that include image topic information. Experimental results on multiple datasets show that this method is effective in generating captions that contain image topic information.

(3) The proposed image caption generation methods have been applied to the development of an image caption generation system. The system is designed around the principles of rapid development, rapid response, and simple operation. It includes two main functions: image upload and image caption generation. Users can upload their images to the system and generate captions that highlight the image topic, based on the proposed method. The system has been tested and evaluated on multiple datasets, showing its effectiveness in generating accurate and topic-specific image captions.
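
The edge-weighting step of the topic correlation graph in (2) can be illustrated with a minimal sketch. The abstract does not say which distributions are compared or how a distance is turned into a similarity, so everything below is an assumption made for illustration: each topic word is given a hypothetical histogram over a small, ordered, unit-spaced 1-D support, the distance is the 1-D Wasserstein-1 distance (the area between the two cumulative distributions), and the edge weight is taken as 1 / (1 + distance). The function names and the toy histograms are invented here and are not taken from the thesis.

```python
from itertools import accumulate, combinations


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two histograms on the same unit-spaced 1-D support."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same length")
    # Normalise so that both histograms sum to 1.
    p_total, q_total = sum(p), sum(q)
    cdf_p = accumulate(x / p_total for x in p)
    cdf_q = accumulate(x / q_total for x in q)
    # In one dimension, W1 equals the area between the two CDFs.
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def build_topic_graph(word_hists):
    """Return {(word_a, word_b): weight} for every unordered pair of topic words."""
    edges = {}
    for a, b in combinations(sorted(word_hists), 2):
        distance = wasserstein_1d(word_hists[a], word_hists[b])
        # Assumed mapping from distance to similarity; the abstract does not specify one.
        edges[(a, b)] = 1.0 / (1.0 + distance)
    return edges


# Toy, made-up histograms for illustration only; not data from the thesis.
word_hists = {
    "beach": [0.7, 0.2, 0.1],
    "ocean": [0.6, 0.3, 0.1],
    "street": [0.1, 0.2, 0.7],
}
topic_graph = build_topic_graph(word_hists)
```

Under these assumptions, words with similar histograms ("beach" and "ocean") receive a heavier edge than dissimilar ones ("beach" and "street"). A real system would more likely compare learned topic or embedding distributions with a task-specific ground metric.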
Keywords/Search Tags: image captioning, knowledge graph, scene graph generation, topic word, adaptive attention mechanism