Font Size: a A A

Research Of Visual Scene Understanding Algorithm Based On Deep Learning

Posted on:2021-01-28Degree:MasterType:Thesis
Country:ChinaCandidate:Y M LiFull Text:PDF
GTID:2428330614460384Subject:Software engineering
Abstract/Summary:PDF Full Text Request
Semantic scene understanding is a key issue in the field of computer vision.It is the main tool for computers to perceive the real world by simulating human visual functions.With the wide application of deep learning in the field of computer vision,major breakthroughs have been made in object detection and instance segmentation,but highlevel semantic scene understanding tasks such as image description and visual question answering(VQA)still need further exploration and research.As a semantic description of images,scene graphs have exemplified the promotion of high-level semantic scene understanding tasks on many tasks.With the hard work of researchers,the scene graph generation task has also achieved considerable development.However,in the real world,complex visual information still presents many challenges to current scene graph generation methods,such as how to use the correlation between objects in the scene,and it is difficult to label the relationship between all objects in the real scene in the data set.These The problems limit the performance of the scene graph generation method in real scenes.The difficulty of the scene graph generation task is how to use the context information of the objects and relationships in the image,and how to deal with the impact of the deviation of the data set annotation on the model.This thesis proposes a context-based scene graph generation method.This method obtains a comprehensive object representation by fusing object position information,semantic information and visual features,and improve the scene graph generation accuracy by adopting bidirectional long-short-term memory network(Bi-LSTM)to encode context information and modeling structured prediction method using conditional random field(CRF).In view of the deviation of data set labeling,this thesis proposes a zerosample relationship prediction method,which guides the prediction of invisible care categories by remembering the hierarchical semantic information of the network and object categories,to get rid of the limitation of the scene graph generation method of the dataset.
Keywords/Search Tags:Scene Graph, Deep Learning, Structural Prediction, Zero-shot Learning
PDF Full Text Request
Related items