
Path Planning And Hunting Control For Multi-Agent Systems Based On Reinforcement Learning

Posted on: 2024-08-10    Degree: Master    Type: Thesis
Country: China    Candidate: Z L Fan    Full Text: PDF
GTID: 2568307136451514    Subject: Computer Science and Technology
Abstract/Summary:
From autonomous driving and AlphaGo to the recent breakout of ChatGPT, artificial intelligence is developing at a sweeping pace, and reinforcement learning and deep learning are two of its branches that have drawn particular interest. In the field of control, the multi-agent system represents a significant advance in agent technology research and application: agents efficiently accomplish many complex practical tasks through perception, communication, cooperation, and coordination. With automation and intelligence becoming mainstream, multi-agent collaboration has broad application prospects, and the multi-agent cooperative control problem is of substantial research significance. Path planning is one of the primary research topics in agent control, since planning an optimal or near-optimal path is the basis for successfully completing a task. The target hunting problem of multi-agent systems is a kind of multi-agent formation control problem, and target hunting is increasingly used in both civilian and military fields. Many scholars therefore seek more capable and intelligent solutions to the path planning and hunting problems of multi-agent systems. The main work of this thesis is as follows.

(1) In an unknown environment, a path planning method for multi-agent formations based on improved Q-learning is proposed, aiming at the formation and path planning of multi-agent systems. Under the leader-following approach, the leader agent plans the path with an improved Q-learning algorithm, while each follower agent realizes a gravitational potential field (GPF) tracking strategy by selecting actions through a designed cost function. To improve Q-learning, the Q-values are initialized by environmental guidance from the target's GPF. A virtual obstacle-filling avoidance strategy is then presented, which fills non-obstacle cells judged likely to form concave traps with virtual obstacles. In addition, the action selection strategy is improved with the simulated annealing (SA) algorithm, whose control temperature is adjusted in real time according to the learning progress of Q-learning. Experiments indicate that, compared with the traditional algorithm, the improved Q-learning algorithm reduces the convergence time by 89.9% and the number of convergence rounds by 63.4%. With this method, multiple agents have a clear division of labor and quickly plan a globally optimized formation path in a completely unknown environment.

(2) Aiming at the target-surrounding motion of multi-agent systems, a target hunting control method based on deep reinforcement learning is proposed. First, the multi-agent system is modeled as a Markov game, and model-based control is combined with the deep deterministic policy gradient (DDPG) algorithm: a potential energy function is designed from the perspective of multi-agent collaboration, and an improved DDPG algorithm guided by this potential energy model is established for target hunting. Secondly, on the basis of the potential energy model, a target-tracking hunting strategy and a target-circumnavigation hunting strategy are established. In the former, consensus tracking of multiple agents is achieved by designing a velocity potential energy function; in the latter, virtual circumnavigation points are added to construct the potential energy function, realizing the desired circumnavigation. The stability of the target hunting strategies is analyzed with automatic control theory. Finally, simulations verify that target hunting can be achieved successfully by the proposed reinforcement learning method.

(3) An event-triggered target hunting method based on reinforcement learning is proposed to address the frequent communication and updates during hunting. Building on the improved DDPG hunting method, an event-triggered mechanism is added at the strategy level and studied with automatic control theory. The event-triggered conditions are designed by considering both formation position and velocity. An event-triggered tracking hunting strategy and an event-triggered circumnavigation hunting strategy are presented to avoid continuous communication. Stability analysis is conducted for each strategy, and the Zeno phenomenon is proved not to occur. Experimental verification shows that the proposed event-triggered hunting strategies successfully achieve hunting and save resources by reducing the number of communications and triggers.
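Two of the Q-learning improvements in part (1), GPF-based Q-value initialization and SA-based action selection, can be sketched as follows. This is only an illustrative reconstruction on a grid world: the attraction gain `k`, the action set, and the Boltzmann form of the SA selection rule are assumptions, not the thesis's exact formulas.

```python
import math
import random

def init_q_with_gpf(grid_w, grid_h, goal, k=1.0):
    """Initialize Q-values from an attractive (gravitational) potential field
    toward the goal, so early exploration is biased toward the target.
    (Gain k and the linear-in-distance form are illustrative assumptions.)"""
    actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # up, down, right, left
    q = {}
    for x in range(grid_w):
        for y in range(grid_h):
            for a in actions:
                nx, ny = x + a[0], y + a[1]
                dist = math.hypot(goal[0] - nx, goal[1] - ny)
                q[((x, y), a)] = -k * dist  # closer to goal => higher value
    return q, actions

def sa_select_action(q, state, actions, temperature):
    """Simulated-annealing (Boltzmann) action selection: a high temperature
    explores broadly, a low temperature exploits the best-known action."""
    prefs = [q[(state, a)] / max(temperature, 1e-6) for a in actions]
    m = max(prefs)  # subtract the max for numerical stability
    weights = [math.exp(p - m) for p in prefs]
    r = random.random() * sum(weights)
    acc = 0.0
    for a, w in zip(actions, weights):
        acc += w
        if r <= acc:
            return a
    return actions[-1]
```

In a full planner the temperature would be decayed over episodes according to the learning progress, which is what the thesis's real-time temperature adjustment refers to.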
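The potential-energy guidance in part (2) can be illustrated with a simple shaped reward for DDPG: each agent is attracted to a ring of desired radius around the target and repelled from teammates, and the reward is the decrease in potential energy per step. The quadratic ring term, the gains `k_att`/`k_rep`, and the desired radius `r_d` are assumed placeholders, not the thesis's actual function.

```python
import numpy as np

def hunting_potential(agent_pos, target_pos, peers_pos, r_d=2.0,
                      k_att=1.0, k_rep=0.5):
    """Potential energy for target hunting: attracts the agent to a ring of
    radius r_d around the target and repels it from nearby teammates.
    (Functional form and gains are illustrative assumptions.)"""
    dist = np.linalg.norm(agent_pos - target_pos)
    u_att = 0.5 * k_att * (dist - r_d) ** 2            # ring attraction
    u_rep = sum(k_rep / max(np.linalg.norm(agent_pos - p), 1e-3)
                for p in peers_pos)                    # teammate repulsion
    return u_att + u_rep

def shaped_reward(prev_pos, new_pos, target_pos, peers_pos, r_d=2.0):
    """Reward = decrease in potential energy, so the DDPG policy is guided
    toward low-energy (on-ring, well-spread) hunting configurations."""
    return (hunting_potential(prev_pos, target_pos, peers_pos, r_d)
            - hunting_potential(new_pos, target_pos, peers_pos, r_d))
```

A move toward the desired ring yields a positive reward, so the actor network receives a dense learning signal even before any agent reaches a hunting formation.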
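The event-triggered condition in part (3) combines position and velocity, as the abstract states. A common relative-threshold form of such a condition is sketched below; the specific measurement, the constant `sigma`, and the offset `eps` are assumptions standing in for the thesis's designed condition.

```python
import numpy as np

def event_triggered(pos_err, vel_err, last_pos_err, last_vel_err,
                    sigma=0.1, eps=1e-3):
    """Fire a communication/update event only when the deviation from the
    last broadcast state exceeds a state-dependent threshold.
    (sigma and eps are illustrative constants.)"""
    # measurement error since the last triggering instant
    e = (np.linalg.norm(pos_err - last_pos_err)
         + np.linalg.norm(vel_err - last_vel_err))
    # relative threshold plus a positive offset
    threshold = sigma * (np.linalg.norm(pos_err)
                         + np.linalg.norm(vel_err)) + eps
    return e >= threshold
```

The strictly positive offset `eps` is the standard way such designs guarantee a positive minimum inter-event time, which is how Zeno behavior (infinitely many triggers in finite time) is excluded in the stability analysis.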
Keywords/Search Tags: reinforcement learning, multi-agent systems, path planning, target hunting, event triggering