
Research On Object Visual Relation Detection Algorithm

Posted on: 2021-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: S Zhang
Full Text: PDF
GTID: 2518306050964749
Subject: Software engineering
Abstract/Summary:
As the link between isolated objects in an image, a visual relationship reflects the type of interaction between those objects, and it is an important research topic in image understanding. As image classification and object detection have developed rapidly in recent years, researchers have begun to explore more advanced semantic reasoning tasks. As an intermediate task of scene understanding, visual relationship detection can link computer vision with natural language and support higher-level computer vision tasks such as image captioning, visual question answering, and image retrieval. The goal of visual relationship detection is to detect and locate the objects in an image and to predict the relationships between them. Visual relationships are usually represented as subject-predicate-object triples that describe the semantics of a local image region, such as "person ride bike".

However, each relationship can involve many different object-pair combinations, and different object-pair combinations express diverse interactions. This makes relationship detection based on visual features alone a challenging task. At present, most relationship detection approaches follow the object detection pipeline: they generate semantic features by extracting features from the union area of the two objects and then treat visual relationship detection as a classification problem. This rectangular-region feature extraction assumes that the background is always useful for predicting the relationship type, which leaves the model unable to extract important information in some complex environments. Therefore, this thesis designs and implements a visual relationship detection algorithm based on an attention mechanism. In addition, it applies graph methods to the visual relationship detection task and proposes a general network framework based on a graph neural network, which detects visual relationships by establishing correlations between different object regions in the image.

The main work and innovations of this thesis are as follows:

(1) Research on a visual relationship detection algorithm based on an attention mechanism. This thesis proposes a new network model for visual relationship detection that builds on object detection and uses object semantic reasoning and an attention mechanism to improve performance. To overcome the influence of diversity in visual appearance, the algorithm combines a semantic inference module with visual features. Two different attention mechanisms are designed, one for object feature refinement and one for phrase feature refinement. To obtain context information for each object, the object feature refinement module enhances the representation of each object by querying over the other objects in the image. The phrase feature refinement module lets the model automatically learn to attend to the relevant image areas, which further improves performance.

(2) Research and implementation of a visual relationship detection algorithm based on a graph neural network. This thesis proposes applying graph methods to the scene graph analysis task, where a scene graph is composed of visual relationships. To obtain a structured representation of the image, external knowledge guidance and network learning methods are designed to construct the relationship graph structure, which effectively deals with the quadratic number of candidate relations between objects in the image. Each graph node combines visual and linguistic information and is represented by the object's visual feature and the word-vector embedding of its category. A multi-head attention mechanism is added to the graph information propagation process so that the graph neural network can selectively capture the context information of each node. In the relationship prediction stage, the phrase visual features and graph nodes are fused to predict the relationship type.

Finally, the proposed algorithms are validated on the Visual Genome Relationship dataset. The visual relationship detection algorithm based on the attention mechanism achieves competitive results compared with the state-of-the-art method MOTIFNET. It improves on the previous state of the art by a relative 3.1% on the scene classification task and improves on the baseline by an average relative gain of 9.6% across three tasks. The visual relationship detection algorithm based on the graph neural network provides a more general and effective network framework for future visual relationship detection tasks.
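
To make the conventional pipeline mentioned in the abstract concrete, the following is a minimal Python sketch of how candidate object pairs and their union regions might be enumerated before relationship classification. It is an illustration under stated assumptions, not the thesis's implementation: the names `Box`, `Detection`, `union_box`, and `candidate_pairs`, as well as the example detections, are hypothetical. It also shows why the number of candidate relations grows quadratically with the number of detected objects.

```python
from itertools import permutations
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned box given by its (x1, y1, x2, y2) pixel corners."""
    x1: float
    y1: float
    x2: float
    y2: float


class Detection(NamedTuple):
    """One detected object: a category label and its bounding box."""
    label: str
    box: Box


def union_box(a: Box, b: Box) -> Box:
    """Smallest rectangle that encloses both boxes (the 'union area')."""
    return Box(min(a.x1, b.x1), min(a.y1, b.y1),
               max(a.x2, b.x2), max(a.y2, b.y2))


def candidate_pairs(
    detections: List[Detection],
) -> List[Tuple[Detection, Detection, Box]]:
    """Every ordered (subject, object) pair with its union region.

    For n detections this yields n * (n - 1) candidates, which is the
    quadratic growth the abstract refers to.
    """
    return [(s, o, union_box(s.box, o.box))
            for s, o in permutations(detections, 2)]


# Hypothetical detections, for illustration only.
detections = [
    Detection("person", Box(40, 20, 120, 200)),
    Detection("bike", Box(30, 110, 160, 240)),
    Detection("helmet", Box(60, 10, 100, 50)),
]

pairs = candidate_pairs(detections)
print(len(pairs))  # 3 objects -> 3 * 2 = 6 ordered pairs

subject, obj, region = pairs[0]
print(subject.label, obj.label, region)
```

In a classification-style pipeline, features pooled from each union region would then be passed to a predicate classifier. The union rectangle necessarily includes background pixels, which is the limitation the attention-based refinement described in the abstract is meant to address.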
Keywords/Search Tags:Visual Relationship Detection, Attention Mechanism, Graph Neural Network