
Visual Relationship Detection Based On Multi-feature Fusion And Short-Term Memory Selection Network

Posted on: 2020-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Peng
GTID: 2428330572496560
Subject: Computer Science and Technology
Abstract/Summary:
Since the success of AlexNet in the 2012 ImageNet competition, deep learning has greatly improved image classification, object detection and segmentation, reaching or exceeding human-level recognition. On this basis, deeper study of image understanding has become the trend. As an intermediate task between object detection and image understanding, visual relationship detection has received growing attention in recent years and has become one of the research hotspots in computer vision. The goal of visual relationship detection is to identify all <subject-predicate-object> triples in an image and to mark the positions of the subject and the object. It can be divided into three sub-tasks: predicate detection, phrase detection and relationship detection. Compared with tasks such as image classification and object detection, the relationship between objects is more abstract, so effectively representing the visual relationship between objects in natural images is a challenge.

In recent years, researchers have proposed visual relationship detection methods based on techniques such as language priors, statistical dependence and knowledge representation learning. These methods use the visual, positional and/or semantic features of objects to detect relationships. However, they do not fully capture the features that effectively characterize the visual relationship between objects, and they do not consider the interrelationship between different kinds of features, so their detection performance is unsatisfactory. To address these problems, this thesis explores the visual relationship between objects in images and studies visual relationship detection from two perspectives: multi-feature fusion and multi-feature correlation. The main work is as follows:

1) A relationship detection method based on multi-feature fusion is proposed. First, the visual features of each object and the positional features between the objects are extracted by a CNN, and the semantic features of each object are extracted using word vectors. Then, a two-level feature fusion strategy fuses the three types of features, so that the features are related to each other and the relationship is better characterized. Finally, the visual relationship is classified based on the fused multi-feature representation. In comparative experiments on the public datasets VRD and VG, the proposed method outperforms the deep relational network (DR-Net) and deep structure learning (DSL) methods on the three sub-tasks of visual relationship detection.

2) A visual relationship detection method based on a Short-Term Memory Selection network (STMS) is proposed. Building on the multi-feature fusion representation of visual relationships, the detection model is established using a short-term memory selection mechanism. The feature of the union region of the subject and object is taken as the initial state, and the subject and object are taken as inputs; the model then stimulates the union region through the subject and object and outputs the visual relationship. The advantage of this model is that it not only makes full use of the feature of the union region, but also removes unimportant background information by exploiting the reasoning ability of the neural network, thereby improving detection performance. Comparative experiments on the public datasets VRD and VG show that the proposed method is 3% higher than the state of the art in relationship detection. Comparisons on the other sub-tasks also confirm the validity of the proposed short-term memory selection network.
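The abstract does not give the layer layout, so the following is only a minimal sketch of the two ideas, under stated assumptions. It assumes that level one of the fusion combines each object's visual and semantic features and that level two combines the two object representations with the positional feature. A GRU-style cell stands in for the short-term memory selection mechanism, which is a substitution and not the thesis's actual cell. All sizes, weight names and function names are hypothetical, and the random weights stand in for learned parameters.

```python
import math
import random

random.seed(0)

# Hypothetical feature sizes; the thesis does not state them.
VIS, SEM, POS, HID, N_PRED = 8, 4, 4, 6, 5

def linear(in_dim, out_dim):
    """Random (out_dim x in_dim) matrix standing in for learned weights."""
    return [[random.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(x):
    return [1.0 / (1.0 + math.exp(-v)) for v in x]

def softmax(x):
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

W_obj = linear(VIS + SEM, HID)      # level 1 (assumed): visual + semantic
W_rel = linear(2 * HID + POS, HID)  # level 2 (assumed): subject + object + position
W_cls = linear(HID, N_PRED)         # predicate classifier
W_z, W_r, W_h = (linear(2 * HID, HID) for _ in range(3))  # GRU-style gates

def fuse_object(visual, semantic):
    """Level-1 fusion: concatenate one object's features and project."""
    return relu(matvec(W_obj, visual + semantic))

def fuse_relation(subj, obj, position):
    """Level-2 fusion: concatenate both objects with the positional feature."""
    return relu(matvec(W_rel, subj + obj + position))

def gru_step(h, x):
    """One GRU-style update; a stand-in for the memory selection step."""
    z = sigmoid(matvec(W_z, h + x))  # update gate: how much state to rewrite
    r = sigmoid(matvec(W_r, h + x))  # reset gate: how much state to expose
    gated = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in matvec(W_h, gated + x)]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def predict_with_memory(union_feat, subj, obj):
    """Union-region feature is the initial state; subject, then object, are inputs."""
    h = union_feat
    for x in (subj, obj):
        h = gru_step(h, x)
    return softmax(matvec(W_cls, h))

def rand_vec(n):
    return [random.random() for _ in range(n)]

subj = fuse_object(rand_vec(VIS), rand_vec(SEM))
obj = fuse_object(rand_vec(VIS), rand_vec(SEM))
probs_fusion = softmax(matvec(W_cls, fuse_relation(subj, obj, rand_vec(POS))))
probs_memory = predict_with_memory(rand_vec(HID), subj, obj)
```

Both `probs_fusion` and `probs_memory` are probability distributions over `N_PRED` hypothetical predicate classes. In the second path the gates decide how much of the union-region state to keep at each step, which is one plausible reading of how background information could be suppressed.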
Keywords/Search Tags: visual relationship detection, visual features, semantic features, spatial features, short term memory selection