
Buoyancy Control Of Underwater Gliding Snake-like Robot Based On Reinforcement Learning

Posted on: 2020-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: X L Zhang
GTID: 2518306353964449
Subject: Control Engineering

Abstract/Summary:
The underwater gliding snake-like robot is a new kind of bionic robot that combines an underwater glider with a bionic underwater snake-like robot. Thanks to this combination, it has both the gliding gait of an underwater glider, with low energy consumption and low noise, and the serpentine gait of a bionic snake-like robot, with good maneuverability and flexible movement. However, because the hydrodynamic model is unknown, most model-based controllers and linear PID controllers cannot effectively solve the complex control problems of underwater vehicles. In this thesis, a model-free reinforcement learning algorithm is chosen to learn the control policy automatically. Three kinds of gliding motion of the robot, all driven by buoyancy change, are studied with reinforcement learning. Buoyancy control of the robot is realized, and training of the policy function is accelerated by improving the algorithm.

To train the control policy for gliding motion effectively with reinforcement learning, a dedicated virtual environment is first built. The environment includes an inaccurate hydrodynamic model, the interfaces required by the reinforcement learning algorithm, and a two-dimensional visual interface. The thesis then studies how to combine robot control with reinforcement learning: it identifies which reinforcement learning elements correspond to the robot's control actions and feedback information, and extracts and generates each element. The elements are designed separately to address the problems that common generation methods encounter, namely the measurement problem, the curse of dimensionality, and application on practical devices. Some heuristic functions are also added so that the control objectives are reached faster.

For gliding motion under buoyancy control, and with generality and convergence in mind, reinforcement learning algorithms based on value function iteration and on direct policy iteration are adopted in turn. Simulation experiments with the two kinds of algorithm lead to the conclusion that direct policy iteration is the more practical choice for buoyancy control of the robot. The classical Monte Carlo policy gradient algorithm is selected as the direct policy iteration method and is improved in two stages. First, the policy function is fitted by a neural network and the state input is preprocessed, giving a preprocessed Monte Carlo policy gradient method. The training results in simulation show that this preprocessed algorithm accelerates convergence and alleviates the problem caused by partially observable states. Then, to make the algorithm more general, a recurrent Monte Carlo policy gradient algorithm is proposed by adding a recurrent neural network built from long short-term memory (LSTM) units. Training on the three kinds of gliding motion shows that this algorithm brings the control process closer to a Markov decision process, greatly improves training speed, and achieves good results for all three motions, which verifies the effectiveness of the new recurrent Monte Carlo policy gradient algorithm.
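As a rough illustration of the classical Monte Carlo policy gradient method named in the abstract, the sketch below performs one update from a single recorded episode. It is not the thesis's implementation: the thesis fits the policy with a neural network, whereas this sketch assumes a tabular softmax policy for brevity. The state index, the two actions (0 for decrease buoyancy, 1 for increase buoyancy), the rewards, and the function names are all hypothetical.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(theta, episode, gamma=0.99, lr=0.1):
    """One Monte Carlo policy gradient step for a tabular softmax policy.

    theta:   dict mapping state -> list of action preferences (hypothetical layout)
    episode: list of (state, action, reward) tuples from one complete rollout
    """
    returns = discounted_returns([r for _, _, r in episode], gamma)
    # Accumulate sum_t G_t * grad log pi(a_t | s_t) with theta held fixed.
    grads = {s: [0.0] * len(p) for s, p in theta.items()}
    for (s, a, _), g in zip(episode, returns):
        probs = softmax(theta[s])
        for k, p in enumerate(probs):
            # Gradient of log softmax w.r.t. preference k: 1[k == a] - pi(k | s)
            grads[s][k] += g * ((1.0 if k == a else 0.0) - p)
    # Gradient ascent on the expected return.
    for s in theta:
        for k in range(len(theta[s])):
            theta[s][k] += lr * grads[s][k]
    return theta


# Hypothetical usage: one state, two buoyancy actions, a one-step episode.
policy = {0: [0.0, 0.0]}
reinforce_update(policy, [(0, 1, 1.0)], gamma=1.0, lr=0.1)
```

Because the update weights each action's log-probability gradient by the full episode return, no hydrodynamic model is needed, which is the property the abstract relies on. The preprocessing and LSTM stages described above would replace the table with a network and are not shown here.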
Keywords/Search Tags: Underwater gliding snake-like robot, reinforcement learning, motion control, Markov decision process, model-free