Scene Graph Generation Method Based On Relation Visual Attention Mechanism

Posted on: 2021-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: S W Wang
Full Text: PDF
GTID: 2518306050470464
Subject: Computer application technology
Abstract/Summary:
In recent years, with the rapid development of deep learning, many computer vision tasks have made great progress. However, understanding the semantic information contained in images more deeply still poses many problems and challenges. The scene graph generation task describes the relationships between the objects in an image and constructs the semantic information of the scene, which can guide tasks such as image description and visual question answering. For this reason, it has become a key link in image interpretation research. This thesis explores scene graph generation from three aspects:

(1) A scene graph generation method based on a relation visual attention mechanism is proposed. It addresses two problems: a relation has no concrete visual representation in the image, and the regions attended to by the relation representations that existing methods learn do not correspond to where the relation occurs. A relation lacks a concrete representation in the sense that its location varies from image to image, so no specific, explicit visual feature corresponds to it. Based on this observation, and by mining the interaction information between the two objects, this thesis proposes the relation visual attention mechanism. In this mechanism, the subject and the object of a relation pair exchange information. By iterating over the subject and object information alternately, the model learns the inner relation between the two objects, obtains a representation of the relationship, and uses it to detect the final relationship. Visualization and comparison with existing methods show that the relationship representation learned by this method focuses on the area where the relationship between the two objects occurs and has advantages in the evaluation results.

(2) A scene graph generation method based on target relation selection is proposed, and its validity is verified by combining it with existing methods. Not all pairs of objects in an image are likely to be related, so selecting in advance the object pairs that are likely to have relationships and eliminating invalid combinations greatly improves the efficiency of the model. Two methods of target relation selection are established: a method based on relation bias, in which a relationship selection dictionary is built by analyzing the dataset, and a method based on additive pooling, in which a deep learning model is built to learn relation selection autonomously. Comparative experiments show that each method has its own preferences, but both can screen out invalid relationships in advance and reduce invalid detections in relationship prediction.

(3) A scene graph generation method based on relationship constraint loss is proposed. To solve the problem that the attention region of the relationship representation is scattered in the method based on the relation visual attention mechanism, two loss functions are proposed: the relationship constraint loss function and the object box constraint loss function. Because the region of the relation between two objects lies in the part where they come into contact, the relationship constraint loss function is proposed to constrain the relationship representation to concentrate more on the region where the relation actually occurs. At the same time, to enhance the effect of the relationship constraint loss, the object box constraint loss function is proposed to measure the difference between the intersection of the predicted object boxes and the annotation, and to produce a more accurate scene graph. Ablation experiments and comparisons with other methods verify the effectiveness of the method.
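
The relation-bias selection idea in aspect (2) can be illustrated with a small sketch. This is not the thesis's implementation: the function names, the `min_count` threshold, and the toy triples are assumptions made only for illustration. It builds a selection dictionary from the class pairs that co-occur in annotated triples, then keeps only the detected object pairs whose classes appear in that dictionary.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# (subject class, predicate, object class); a hypothetical annotation format.
Triple = Tuple[str, str, str]


def build_selection_dictionary(
    triples: Iterable[Triple], min_count: int = 1
) -> Set[Tuple[str, str]]:
    """Collect (subject, object) class pairs seen at least min_count times."""
    counts = Counter((subj, obj) for subj, _pred, obj in triples)
    return {pair for pair, n in counts.items() if n >= min_count}


def select_candidate_pairs(
    labels: List[str], allowed: Set[Tuple[str, str]]
) -> List[Tuple[int, int]]:
    """Return ordered index pairs (i, j), i != j, whose classes are allowed."""
    return [
        (i, j)
        for i, subj in enumerate(labels)
        for j, obj in enumerate(labels)
        if i != j and (subj, obj) in allowed
    ]


# Toy annotations (made up for this example).
train = [
    ("person", "riding", "horse"),
    ("person", "wearing", "hat"),
    ("person", "riding", "horse"),
    ("hat", "on", "person"),
]
allowed = build_selection_dictionary(train, min_count=2)
detected = ["person", "horse", "hat"]
pairs = select_candidate_pairs(detected, allowed)
print(pairs)
```

With three detected objects there are six ordered pairs; the dictionary removes those whose class combination was rarely or never annotated, so the relationship predictor only runs on the remaining candidates.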
Keywords/Search Tags: Scene Graph Generation, Relationship Representation, Relation Visual Attention Mechanism, Relation Selection