Vision-and-Language Navigation Based On Knowledge Graph

Posted on: 2022-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: X F Wu
GTID: 2518306725978809
Subject: Control Engineering
Abstract/Summary:
Vision-and-Language Navigation (VLN) is a natural navigation task in human-interactive environments. It requires an agent to understand natural language instructions, analyze the information contained in its visual input, make behavioral decisions based on its own state, and navigate according to the instructions. The difficulty of VLN is that the agent must correctly understand complex and ambiguous natural language instructions and accurately identify and align objects in an open environment. Uncertainty in unseen environments poses a major challenge to the agent's localization and decision-making. This thesis introduces knowledge graphs into VLN, proposes Vision-and-Language Navigation based on Knowledge Graph, and uses prior knowledge to improve the accuracy of agent navigation in the environment. First, we build a knowledge graph for the navigation task, which is processed by a Graph Convolutional Network (GCN) and integrated into VLN so that the agent can reason from prior knowledge in either the language or the visual branch. Second, two knowledge-graph-based VLN models are proposed and implemented: the Graph Cross-Modal Reasoning Navigator (GCMRN) and the Attention Graph Convolutional Network Cross-Modal Recognition Reasoning Navigator (AGCMR2N). Finally, to verify the validity of the AGCMR2N model, Attention Graph Convolutional Network for Zero-shot Learning (AGCNZ) is proposed and tested on a Zero-shot Learning (ZSL) task, which demonstrates the validity of the Attention Graph Convolutional Network used in the full model. To bring the experimental setting closer to real life, we carry out detailed experiments in a continuous 3D environment, which removes many assumptions implicit in previous work and makes the algorithm more feasible to deploy on actual devices. The experimental results verify the validity and feasibility of the Vision-and-Language Navigation model based on knowledge graph.
Keywords/Search Tags: Vision-and-Language Navigation, Graph Convolution Network, Knowledge Graph
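
The abstract states that the knowledge graph is processed by a Graph Convolutional Network but does not give the layer formulation. As a rough illustration only, the sketch below applies one step of the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W) to a toy graph. The node names, edges, features, and weights are hypothetical and are not taken from the thesis, and the thesis's actual architecture (including its attention variant) may differ.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Node degrees in the self-looped graph.
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features, then apply the linear map and ReLU.
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical toy knowledge graph: "kitchen" and "fridge" are linked,
# "sofa" is isolated. None of this comes from the thesis.
nodes = ["kitchen", "fridge", "sofa"]
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0]]
features = [[1.0, 0.0, 0.0],   # one-hot input feature per node
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
weights = [[1.0, 0.0],         # illustrative weights, not learned
           [0.0, 1.0],
           [1.0, -1.0]]

embeddings = gcn_layer(adjacency, features, weights)
for name, vec in zip(nodes, embeddings):
    print(name, vec)
```

In this toy case the two linked nodes end up with the same embedding because each averages over itself and its neighbour, while the isolated node depends only on its own features. That is the sense in which prior relational knowledge can be shared between related concepts.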