In recent years, with the increasing complexity of the tasks carried out by ships, higher requirements have been placed on the automatic control systems of underactuated ships. An automatic berthing system is an indispensable part of efficient and safe navigation, and with the development of unmanned surface vessel technology, establishing an efficient and accurate intelligent automatic berthing system is of great practical significance. Reinforcement learning has become a major research direction in artificial intelligence owing to its potential for solving complex control and decision-making problems. Reinforcement Learning from Demonstration (RLfD) methods, which combine reinforcement learning and imitation learning, can improve the speed and stability of agent training by using data provided by expert policies. Although RLfD has good prospects for practical application, the distribution mismatch problem must be addressed at the same time. Aiming at the automatic berthing problem of underactuated surface vessels, this paper designs two RLfD methods that combine Actor-Critic learning with model predictive control. The theoretical and simulation results show that the proposed methods converge well and effectively resolve the distribution mismatch problem. The simulations also show that the learning speeds of the RLfD algorithms are more than 50% faster than that of a typical model-free Actor-Critic algorithm.

The contributions and innovations of this paper are as follows:

(1) For the automatic berthing problem of underactuated ships, the problem is formulated as a Markov decision process on the basis of a mathematical model of the ship, and a reinforcement learning scheme is designed to solve it. Model-free Actor-Critic algorithms are applied to the ship berthing problem. Simulation results show that the reinforcement learning method can complete the automatic berthing task without relying on mathematical model information or motion planning.

(2) To address the slow convergence of the model-free Actor-Critic algorithm, an RLfD method combined with model predictive control is proposed. To overcome the problems of insufficient expert data and sub-optimal expert policies, an interactive expert controller combining Actor-Critic with model predictive control is designed; it provides the agent with expert data and improves its own performance in step with the agent's learning. For the distribution mismatch problem of the proposed RLfD method, two improvement schemes are proposed on the basis of theoretical analysis. Simulations confirm the effectiveness of the proposed RLfD method and its variants: compared with the model-free Actor-Critic method, training is faster and learning efficiency is higher.

(3) Through theoretical analysis, the original reinforcement learning problem is reformulated as a constrained optimal control problem, and on the basis of the RLfD framework the SGAC algorithm is proposed. The algorithm lets the agent interact with the environment while the expert is responsible only for providing guidance online. In the training phase, the dual gradient method is used to solve the optimization problem, and the convergence of the proposed method is analyzed theoretically. Tests in the established automatic berthing simulation environment show that the SGAC algorithm solves the distribution mismatch problem, with a more stable learning process and faster convergence. Compared with the model-free Actor-Critic algorithm, the berthing trajectory obtained by SGAC is also smoother.
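As a minimal sketch of the dual gradient (dual ascent) idea mentioned in contribution (3), the following toy example solves a simple constrained problem by alternating a primal minimization of the Lagrangian with a projected gradient-ascent step on the multiplier. The problem, step size, and variable names here are illustrative only and are not taken from the thesis:

```python
# Dual ascent sketch for: minimize f(x) = x^2  subject to  g(x) = 1 - x <= 0.
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x), with lam >= 0.
def dual_ascent(steps=200, lr=0.5):
    lam = 0.0  # Lagrange multiplier for the inequality constraint
    x = 0.0
    for _ in range(steps):
        # Primal step: closed-form minimizer of L over x (dL/dx = 2x - lam = 0)
        x = lam / 2.0
        # Dual step: gradient ascent on lam along g(x), projected onto lam >= 0
        lam = max(0.0, lam + lr * (1.0 - x))
    return x, lam

x_opt, lam_opt = dual_ascent()
# Converges to the KKT point x = 1, lam = 2.
```

In the thesis's setting the primal variable is the policy and the constraint encodes the expert guidance, so the primal step is itself an iterative update rather than a closed-form solve, but the alternating primal/dual structure is the same.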