Research And Implementation Of Multimodal Algorithm For Path Decision-making In Visual Language Navigation System

Posted on:2022-09-20

Degree:Master

Type:Thesis

Country:China

Candidate:J R Zou

GTID:2518306497452124

Subject:Master of Engineering

Abstract/Summary:

Visual language navigation is a cross-modal task that combines computer vision and natural language processing. The model must process information in two different forms, images and natural language, and use it to complete a navigation task in a simulated real-world 3D environment. At present, most related studies try to improve visual language navigation models by processing image and natural language information better or by improving the navigation algorithm, and they overlook the possibility that an intelligent robot could obtain more information from its environment. After analyzing the natural language instructions in the data set for this task, we found that region information accounts for a considerable proportion of the instructions: on average, each instruction contains about two region-related words. Combining this finding with the everyday experience of navigating by following instructions, this thesis proposes a region information model that uses region information to assist navigation. The model integrates the current region information, obtained from the image, with the next region information, predicted from the natural language instruction. This cross-modal information is treated as prior information that assists both the training of the navigation model and the navigation of the intelligent robot. Experiments on several open-source visual language navigation models show that using region information to assist training and navigation improves the navigation success rate and, in particular, the length of the successful path, a key indicator for the task. After region information is added, the navigation model also performs better in unfamiliar environments. In addition, most research on the visual language navigation task is conducted in English. Building on the existing results, this thesis processes the data set in Chinese, studies the Chinese visual language navigation task, and achieves good performance.
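
The instruction analysis described in the abstract (counting how often region words appear per instruction) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration only: the vocabulary `REGION_WORDS`, the helper names `count_region_mentions` and `mean_region_mentions`, and the sample instructions are hypothetical and are not taken from the thesis, which does not list its word set or its counting procedure in this abstract.

```python
import re

# Hypothetical region vocabulary; the thesis does not publish its actual list here.
REGION_WORDS = {
    "kitchen", "bedroom", "bathroom", "hallway", "living room", "dining room",
}


def count_region_mentions(instruction, vocabulary=REGION_WORDS):
    """Count whole-word occurrences of region terms in one instruction."""
    text = instruction.lower()
    total = 0
    for term in vocabulary:
        # \b anchors keep "bedroom" from matching inside a longer word.
        pattern = r"\b" + re.escape(term) + r"\b"
        total += len(re.findall(pattern, text))
    return total


def mean_region_mentions(instructions, vocabulary=REGION_WORDS):
    """Average number of region terms per instruction (0.0 for an empty list)."""
    if not instructions:
        return 0.0
    counts = [count_region_mentions(i, vocabulary) for i in instructions]
    return sum(counts) / len(counts)


# Made-up sample instructions, not drawn from any data set.
sample = [
    "Walk through the kitchen and stop in the hallway.",
    "Exit the bedroom, go past the living room, and enter the bathroom.",
    "Turn left and stop.",
]
average = mean_region_mentions(sample)
```

A real analysis would need a vocabulary matched to the data set's annotations and would have to handle plurals and synonyms, which this sketch deliberately leaves out.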

Keywords/Search Tags:

Visual language navigation, region information, reinforcement learning, cross-modality, Chinese navigation

Related items

1	Researches On Reinforcement Learning And Its Visual Navigation Application Techniques
2	Agent Navigation Based On Deep Reinforcement Learning
3	Robot Visual Navigation Algorithm Based On Deep Reinforcement Learning
4	Research On Visual Navigation Based On The First-person Perspective
5	Research Of Method For Visual Navigation Based On Deep Reinforcement Learning
6	Reinforcement Learning And Vision Navigation Based Mobile Robot Controlment
7	Robot Visual Navigation Algorithm Based On Deep Reinforcement Learning
8	Research And Application Of Multi-robot Obstacle Avoidance Navigation Based On Deep Reinforcement Learning
9	Research And Implementation Of Deep Reinforcement Learning Algorithm For Visual Perception And Navigation
10	Reinforcement Learning Robot Navigation Algorithm For Diverse Dynamic Environments