| With the development of the marine economy, more and more countries are paying attention to marine science and technology. For a large maritime country with a vast coastline such as ours, marine strategy has already become an important part of the national strategic system. Autonomous underwater vehicles (AUVs) are essential equipment for ocean exploration and development, and have been successfully applied to various underwater tasks, such as seafloor mapping, pipeline repair, and field source search. An AUV cannot complete an underwater mission without a good motion control system, so the study of its motion control strategy is of great significance. However, the underwater environment is complex and changeable, and the AUV system is highly nonlinear, strongly coupled, and subject to model uncertainty, so research on AUV motion control remains challenging. Reinforcement learning (RL) is a class of learning algorithms in which an agent accumulates experience from the feedback of its environment. This learning capability offers an effective way to give an AUV a degree of self-learning and self-adaptive ability. Therefore, this paper studies RL-based AUV motion control in depth. The details are as follows: (1) An adaptive model-free optimal RL neural network trajectory tracking control method based on filtering error is proposed. Underwater navigation consumes a great deal of energy, so optimality must be considered when designing an AUV trajectory tracking control method. The proposed method addresses AUV trajectory tracking control under saturation constraints. Because the Hamilton-Jacobi-Bellman (HJB) equation of the AUV dynamics is difficult to solve, an RL strategy based on an actor-critic framework is proposed to approximate its solution. In addition, an optimal controller design method
based on filtering errors is proposed for the first time for the AUV, a system whose second-order dynamics are modeled in strict-feedback form, to simplify the controller design and speed up the system response. The method is rigorously analyzed and proved in theory, and its effectiveness is verified by a simulation example. (2) An adaptive RL point-to-point motion fault-tolerant control (FTC) method based on an integral extended state observer (IESO) is proposed. The complex and changeable underwater environment leads to many unpredictable AUV faults, and it is not feasible to repair or replace faulty thrusters immediately during operation. Therefore, fault tolerance must be considered in the controller design phase. An RL FTC method is proposed for the point-to-point motion control of AUVs with thruster faults. To deal with thruster faults, unknown disturbances, and model uncertainty, a new IESO for fault diagnosis and observation is proposed. It uses a traditional extended state observer (ESO) to estimate the total system uncertainty and introduces an integral mechanism to further mitigate the effect of the estimation error, thereby alleviating the loss of fault-tolerant capability that the estimation error of a traditional ESO causes in the FTC system. Furthermore, based on the actor-critic structure of RL, a PD-like feedback controller is designed that uses the IESO's accurate estimate of the total uncertainty to achieve FTC of the AUV under thruster faults. This method is also rigorously analyzed and proved in theory, and its effectiveness is verified by a simulation example. (3) An AUV underwater experimental platform is built to verify the designed control methods. Because the existing AUV is difficult to redevelop, its hardware and software are redesigned and reassembled, and the AUV underwater experimental platform is built. Then, the usability of the platform is verified by simple tests, including tests
of depth control, yaw angle control, and horizontal fixed-point trajectory tracking control; the test results meet the experimental requirements. Finally, the effectiveness and superiority of the two control algorithms are verified on the AUV underwater experimental platform. In summary, this paper studies RL-based optimal control and fault-tolerant control of AUVs in depth to improve the navigation ability and reliability of the AUV system. The research results not only apply to AUV systems but also provide a valuable theoretical reference for motion control research on other systems. |