Research On Scene Graph Generation Methods Based On Deep Learning

Posted on:2022-09-19

Degree:Master

Type:Thesis

Country:China

Candidate:J W Duan

GTID:2518306539991899

Subject:Information and Communication Engineering

Abstract/Summary:

Scene graph generation detects the target objects in an image, infers the relationships between them, and represents the image as a graph structure. Scene graphs act as a bridge between natural language and computer vision and have become a popular research topic in image understanding in recent years. Deep learning, a powerful tool for image understanding, has been widely applied to this task. However, existing scene graph generation methods still have two problems.

First, the diversity of the relationships they infer is limited. On the one hand, incomplete features limit diversity: existing methods rely on visual features alone for category reasoning, and the differences between similar relationships are small. On the other hand, the long-tailed distribution of the data sets limits diversity: samples of common triplets make up most of a data set, while many uncommon relationships have few samples. Existing methods predict similar relationships as the common ones to increase recall, which hurts relationship diversity. Second, existing methods have poor domain adaptability. They are all built on specific natural-image data sets, and most of them absorb the reasoning habits peculiar to those data sets, which limits their domain adaptability.

For the first problem, limited relationship diversity, this work proposes a scene graph generation method based on global-semantic information assistance, called SGG_G-SIA. First, SGG_G-SIA integrates the global statistical knowledge and semantic information provided by the data set into a global semantic coding and fuses it with visual features to represent objects and relationships, which alleviates the poor relationship diversity caused by incomplete features. Second, SGG_G-SIA uses the reprocessed global statistical knowledge to guide the inference of object and relationship categories, which mitigates the effect of the long-tailed distribution and increases relationship diversity. Finally, SGG_G-SIA designs separate networks for feature fusion and category reasoning on objects and on relationships, so that each module is tailored to the information it has to aggregate.

For the second problem, poor domain adaptability, this work proposes a scene graph generation method based on multi-modal fusion and counterfactual reasoning, called SGG_MFCR. SGG_MFCR fuses information from two modalities, vision and language, into the predictive features of a relationship, providing richer information for representing relationships. It then adopts a counterfactual reasoning strategy to capture the reasoning habits peculiar to the specific data set and explicitly removes them at test time, yielding a network that predicts common and uncommon relationships fairly and adapts well across domains. An SGG_MFCR network trained on existing data sets can be applied directly to computer-generated images without relying on their annotations and without retraining.

Finally, this work generates sets of computer-generated images from image descriptions and semantic layouts, then applies SGG_MFCR to them to verify the network's domain adaptability and to produce robust scene graphs that help people understand computer-generated images. Both methods were tested comprehensively. The experimental results show that both generate more robust scene graphs for describing images. Compared with existing scene graph generation methods, SGG_G-SIA significantly improves the feature richness of objects and relationships and the diversity of relationships, and SGG_MFCR significantly improves feature richness, relationship diversity, and domain adaptability. In short, the proposed methods outperform existing scene graph generation methods in several respects. Some of the results have been published in SCI-indexed journal papers.
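
The counterfactual step described in the abstract (capture the data set's reasoning habit, then remove it at test time) can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the habit is removed by subtracting the predicate scores obtained from a counterfactual input (for example, one with the visual evidence masked) from the scores obtained from the real input. The function names, the predicate list, and all score values are hypothetical.

```python
from typing import List

def debias(factual: List[float], counterfactual: List[float]) -> List[float]:
    """Subtract the bias-only scores from the factual scores, per predicate.

    Assumption: `counterfactual` holds the scores the model produces when the
    visual evidence is removed, i.e. what it predicts from habit alone.
    """
    if len(factual) != len(counterfactual):
        raise ValueError("score lists must have the same length")
    return [f - c for f, c in zip(factual, counterfactual)]

def predict(labels: List[str], scores: List[float]) -> str:
    """Return the label with the highest score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]

# Hypothetical predicate vocabulary and scores for one subject-object pair.
PREDICATES = ["on", "sitting on", "riding"]
factual_scores = [2.0, 1.9, 1.2]    # scores from the real image
bias_only_scores = [1.8, 0.5, 0.4]  # scores with visual evidence masked

plain = predict(PREDICATES, factual_scores)
debiased = predict(PREDICATES, debias(factual_scores, bias_only_scores))
print(plain, "->", debiased)
```

In this toy case the common predicate "on" wins on the raw scores mostly because of habit, and the less common "sitting on" wins once the bias-only scores are subtracted.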

Keywords/Search Tags:

Scene graph generation, deep learning, global semantic coding, counterfactual reasoning, limited relationship diversity, domain adaptability

Related items

1	Scene Graph Generation For Image Semantic Understanding And Represention
2	Research On Scene Graph Generation Method Based On Deep Learning
3	Visual Relationship Generation Based On Scene Understanding
4	Scence Graph Generation Based On Context
5	Research On Natural Language Question Generation Over Knowledge Graphs
6	Research And Implementation Of Scene Graph Generation Algorithm Based On Attention Mechanism
7	Research On Scene Graph Generation Algorithm Based On Causality
8	Research On Text-to-Scene Conversion Technology Based On Entity And Spatial Relationship Reasoning
9	Deep Learning Model Interpretation Methods Based On Counterfactual Image Generation
10	Research On Fine Granular Rich Semantic Image Subtitle Generation Method Based On Deep Learning