| In recent years, smartphones have become increasingly popular, and the volume of image data generated by users has grown dramatically. At the same time, with the development of the mobile Internet, users have become accustomed to sharing pictures and videos online. As a result, the application value of image comprehension is increasingly evident. While objects are the core building blocks of an image, it is often the relationships between objects that determine its holistic interpretation. Visual relationship detection is therefore a key step in image comprehension and an important bridge between computer vision and natural language processing, and it has become a research hotspot in computer vision. First, this thesis introduces the basic theory and common models of deep learning, which has become the most important method for feature extraction in computer vision. Second, this thesis uses a convolutional neural network to construct a relationship network capable of detecting the subject, predicate, and object simultaneously, and obtains a new relationship detection model by merging the relationship network with a Region Proposal Network. The main advantage of this model is that it supports end-to-end training and prediction. Finally, this thesis proposes a proposal generation algorithm based on relationship information. Existing proposal generation algorithms focus on individual proposals and pay little attention to the correlations between them. This thesis introduces relationship information into the proposal generation algorithm to prevent it from generating unreasonable region proposals, thereby improving the recall rate of the algorithm. Experiments on the Visual Relationship dataset and comparisons with existing algorithms verify the superiority of the proposed algorithm and show that relationship information can improve the recall rate of proposal generation algorithms. |
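
The recall rate of a proposal generation algorithm is commonly measured as the fraction of ground-truth boxes matched by at least one proposal. The sketch below illustrates that general idea only. The abstract does not state the thesis's evaluation protocol, so the IoU threshold of 0.5, the `(x1, y1, x2, y2)` box format, and the function names `iou` and `proposal_recall` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def proposal_recall(gt_boxes: List[Box], proposals: List[Box],
                    iou_threshold: float = 0.5) -> float:
    """Fraction of ground-truth boxes covered by at least one proposal.

    The 0.5 threshold is an assumed default, not a value taken from the thesis.
    """
    if not gt_boxes:
        return 0.0
    hits = sum(
        1 for gt in gt_boxes
        if any(iou(gt, p) >= iou_threshold for p in proposals)
    )
    return hits / len(gt_boxes)


# Hypothetical data: one ground-truth box is covered, the other is missed.
gt = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
props = [(1.0, 1.0, 10.0, 10.0), (50.0, 50.0, 60.0, 60.0)]
print(proposal_recall(gt, props))
```

This version scores individual boxes. Evaluating relationship detection would additionally require matching subject and object boxes as a pair, which is not shown here.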