With the continuous development of technology and social progress, it is an inevitable trend for robots to gradually enter human life in order to liberate productivity and improve people's living standards. However, an important prerequisite for mobile robots to complete various complex tasks is good autonomous navigation ability. Therefore, in the face of increasingly complex work scenarios and dynamically changing surroundings, indoor mobile robot navigation is an important and challenging problem. This paper studies indoor interactive navigation algorithms based on deep reinforcement learning: by constantly interacting with the environment, a mobile robot can continuously enhance its perception of the external environment and learn efficient ways of interacting with it. In indoor navigation problems, traditional SLAM algorithms often assume that the environment is static; in real indoor scenes, however, there are often interactive objects that block the robot's path. In such cases, compared with traditional navigation algorithms, navigation algorithms based on deep reinforcement learning can continuously adjust their navigation strategies according to the dynamic external environment, and therefore offer higher robustness and stronger adaptability.

The main research contents of this paper are as follows:

1. This paper proposes an algorithmic framework for robot indoor interactive navigation. We first model the indoor interactive navigation problem as a partially observable Markov decision process (POMDP) and define the robot's observation space and action space based on the LoCoBot platform. To solve the POMDP, we analyze and design two state encoders for processing historical observations: one based on a recurrent neural network (RNN) and one based on the self-attention mechanism. Under the action of a masking matrix, the proposed self-attention-based state encoder can be trained and updated in parallel, offering higher computational efficiency and a stronger ability to capture temporal features than RNNs (an illustrative sketch of such an encoder is given after this list).

2. This paper analyzes and alleviates several issues faced by deep reinforcement learning algorithms for visual navigation, including low sampling efficiency, unstable learning, sparse rewards, and the difficulty of optimizing reinforcement learning algorithms with high-dimensional image inputs. To address these issues, the paper designs an off-policy distributed reinforcement learning framework to improve sampling efficiency, applies image transformations to reduce the estimation bias of the Q-value, and designs various reward functions to mitigate the sparse-reward problem. The paper also proposes predicting the optimal next waypoint as an auxiliary task, providing sufficient supervision signals for robot state representation learning, which is especially important for optimizing self-attention-based reinforcement learning algorithms with high-dimensional inputs (sketches of the image-transformation and waypoint-prediction ideas also follow this list). Finally, based on the modeling and optimization methods above, we propose a complete framework and training process for reinforcement-learning-based indoor interactive visual navigation.

3. This paper conducts extensive simulation experiments on iGibson, a fully interactive navigation platform, and verifies the effectiveness of the proposed algorithms on public test sets. The experimental results show that the proposed indoor interactive navigation algorithm has stronger perception and more efficient decision-making ability than the benchmark algorithms and can navigate flexibly in complex indoor scenes. In addition, the algorithm also shows good robustness and generalization ability in unknown dynamic environments.
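For concreteness, the following is a minimal sketch of a masked self-attention state encoder over an observation history, as described in contribution 1. It assumes a PyTorch implementation; all module, dimension, and parameter names (SelfAttentionStateEncoder, obs_dim, embed_dim, and so on) are illustrative assumptions rather than the thesis's actual code. The causal mask is what lets the whole history be processed in one parallel pass, in contrast to the step-by-step rollout an RNN requires.

```python
# A sketch of a self-attention state encoder for a POMDP observation history.
# Names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class SelfAttentionStateEncoder(nn.Module):
    def __init__(self, obs_dim: int, embed_dim: int = 128,
                 num_heads: int = 4, max_len: int = 64):
        super().__init__()
        self.embed = nn.Linear(obs_dim, embed_dim)   # per-step observation embedding
        self.pos = nn.Embedding(max_len, embed_dim)  # learned positional encoding
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
        # obs_seq: (batch, T, obs_dim) -- the entire history in one pass,
        # which is what enables parallel training, unlike an RNN rollout.
        B, T, _ = obs_seq.shape
        x = self.embed(obs_seq) + self.pos(torch.arange(T, device=obs_seq.device))
        # Masking matrix: position t may only attend to observations <= t,
        # so the encoder respects the temporal order of the POMDP history.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                     device=obs_seq.device), diagonal=1)
        h, _ = self.attn(x, x, x, attn_mask=mask)
        h = self.norm(h + x)   # residual connection
        return h[:, -1]        # state embedding for the current time step
```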
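The image-transformation idea in contribution 2 can be read in the spirit of data-regularized Q-learning, where the TD target is averaged over several random augmentations of the next observation to reduce bias and variance in the Q estimate. The sketch below assumes pad-and-crop random shifts; the function names and signatures (random_shift, augmented_td_target, q_target_net, policy) are hypothetical stand-ins, not the thesis's interface.

```python
# A sketch of averaging the TD target over K random image augmentations.
# All names are illustrative assumptions.
import torch
import torch.nn.functional as F

def random_shift(imgs: torch.Tensor, pad: int = 4) -> torch.Tensor:
    """Randomly shift each image by up to `pad` pixels via pad-and-crop."""
    B, C, H, W = imgs.shape
    padded = F.pad(imgs, (pad, pad, pad, pad), mode='replicate')
    out = torch.empty_like(imgs)
    for i in range(B):
        top = torch.randint(0, 2 * pad + 1, (1,)).item()
        left = torch.randint(0, 2 * pad + 1, (1,)).item()
        out[i] = padded[i, :, top:top + H, left:left + W]
    return out

def augmented_td_target(next_obs, reward, gamma, q_target_net, policy,
                        K: int = 2) -> torch.Tensor:
    """TD target averaged over K augmentations of the next observation."""
    targets = []
    with torch.no_grad():
        for _ in range(K):
            aug = random_shift(next_obs)
            next_action = policy(aug)
            targets.append(reward + gamma * q_target_net(aug, next_action))
    return torch.stack(targets).mean(dim=0)
```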
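Finally, the waypoint-prediction auxiliary task can be sketched as a small regression head on top of the state encoder, so that supervision from the optimal next waypoint also flows back into the encoder and densifies the learning signal for state representation. Again, the names and the 2-D (x, y) waypoint format are assumptions for illustration only.

```python
# A sketch of a waypoint-prediction auxiliary head for representation
# learning. Names and the waypoint format are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WaypointHead(nn.Module):
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(),
            nn.Linear(64, 2),   # (x, y) of the predicted next waypoint
        )

    def forward(self, state_embedding: torch.Tensor) -> torch.Tensor:
        return self.mlp(state_embedding)

def auxiliary_loss(head: WaypointHead, state_embedding: torch.Tensor,
                   target_waypoint: torch.Tensor) -> torch.Tensor:
    # Regression toward the planner-provided optimal next waypoint; this
    # gradient also updates the upstream self-attention state encoder.
    return F.mse_loss(head(state_embedding), target_waypoint)
```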