| Adaptive decision-making is an important research direction in the field of decision-making and has produced many significant results. However, because an agent's actual environment is often dynamic and uncertain, environmental changes are difficult to predict in advance, and the appropriate adaptive behavior is therefore difficult to determine beforehand. In an uncertain and unpredictable environment, the agent must be able to formulate adaptive behavior strategies online, according to the state of the environment at that time. An agent's ability to make adaptive decisions in an unknown environment determines its degree of intelligence. Traditional control methods can no longer meet the needs of engineering practice when uncertainty in the environment and in task demands has to be handled through adaptive decision-making. To address this, this paper combines the strong perception ability of deep learning with the efficient decision-making ability of reinforcement learning to solve the agent's adaptive decision-making problem. A deep reinforcement learning algorithm trains the agent to accumulate experience through interaction with the environment, so that it forms its own understanding of how to apply specific behaviors. Because it is difficult to train deep reinforcement learning algorithms on physical systems in a real environment, this paper uses an Unmanned Aerial Vehicle (UAV) mission in a simulation environment as the test case for studying the application of the Soft Actor-Critic (SAC) deep reinforcement learning algorithm to the agent's adaptive decision-making problem, and improves the SAC algorithm to address problems that arise during training. As an efficient model-free algorithm, SAC can meet the needs of robots that acquire skills through learning in complex environments. First, this paper introduces the research significance of solving the agent adaptive decision-making problem based on deep reinforcement
learning algorithms and reviews the research status of deep reinforcement learning and agent adaptive decision-making, both domestically and internationally. It introduces deep learning and several classic reinforcement learning algorithms, which lead to deep reinforcement learning, and then discusses the concept of deep reinforcement learning and several classic deep reinforcement learning algorithms. Second, the simulation environment is built with Python, Pygame, TensorFlow, and other software, and PyCharm is used as the development tool to verify the effectiveness of the algorithms. The UAV anti-interception simulation environment built in this paper is derived from a classified project scenario. In this scenario, our UAV takes off from the starting airport and, while conducting reconnaissance along the way, must break through the opponent's missile interception and land safely at the target airport. This paper establishes a kinematic model of the UAV to bring the simulation closer to the real scenario. Then, the SAC algorithm is applied to the UAV anti-interception task. To address the problems encountered during training, the SAC algorithm is modified to improve the agent's adaptive decision-making ability. The improvements focus on the experience replay strategy: the SAC algorithm is combined with the Prioritized Experience Replay (PER) and Emphasizing Recent Experience (ERE) strategies, yielding the SAC+PER, SAC+ERE, and SAC+PER+ERE algorithms. Changing the sampling strategy used in SAC's experience replay improves the learning efficiency and convergence speed of the algorithm and makes it more stable. Finally, the effectiveness of the algorithms is verified through simulation examples. The original SAC algorithm is compared with the SAC+PER, SAC+ERE, and SAC+PER+ERE algorithms proposed in this paper, and the results show that the improved algorithms have higher learning efficiency and better robustness. |
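
The sampling-strategy change described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: it assumes the standard formulations from the original PER and ERE publications (proportional prioritization on absolute TD error, and a sampling window that shrinks toward recent data within each batch of updates), and it assumes that the combined SAC+PER+ERE variant applies PER inside the ERE window. The function names and the default values of `eta`, `c_min`, `alpha`, and `beta` are illustrative assumptions, not values taken from this work.

```python
import random


def ere_window(k, num_updates, buffer_size, eta=0.996, c_min=5000):
    """ERE: size of the most-recent window to sample from on update k of num_updates."""
    c_k = int(buffer_size * eta ** (k * 1000 / num_updates))
    # Never shrink below c_min, and never exceed what the buffer holds.
    return min(buffer_size, max(c_k, c_min))


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional PER: P(i) = p_i^alpha / sum_j p_j^alpha with p_i = |delta_i| + eps."""
    scaled = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_per_ere(td_errors, k, num_updates, batch_size, rng,
                   eta=0.996, c_min=5000, alpha=0.6, beta=0.4):
    """Assumed PER+ERE combination: PER probabilities restricted to the ERE window.

    td_errors is ordered oldest to newest, one entry per stored transition.
    Returns (buffer indices, importance-sampling weights).
    """
    n = len(td_errors)
    c_k = ere_window(k, num_updates, n, eta, c_min)
    start = n - c_k  # index of the oldest transition still inside the window
    probs = per_probabilities(td_errors[start:], alpha)
    local = rng.choices(range(c_k), weights=probs, k=batch_size)
    # Importance-sampling weights correct the bias from non-uniform sampling;
    # they are normalized by their maximum so that no weight exceeds 1.
    raw = [(c_k * probs[i]) ** (-beta) for i in local]
    max_w = max(raw)
    return [start + i for i in local], [w / max_w for w in raw]
```

Under these assumptions, setting `alpha=0` makes the PER probabilities uniform, which gives ERE-only sampling, and setting `eta=1` keeps the window at the full buffer, which gives PER-only sampling. Setting both reduces to the uniform sampling of the original SAC replay buffer.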