
Reinforcement Learning With Consensus And Event-trigger

Posted on: 2019-02-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W X Zhang
GTID: 1318330566962440
Subject: Control Science and Engineering
Abstract/Summary:
Reinforcement learning is a model-free machine learning method that solves problems by trial and error and is characterized by a simple structure and strong adaptability. It shows great potential in fields such as machine game playing, autonomous robot navigation, and market decision-making, and it provides a feasible way to overcome the knowledge-acquisition bottleneck of intelligent systems. This dissertation focuses on the key problem of improving learning efficiency and studies it from three aspects: the communication method between agents, the algorithm structure, and the search strategy of reinforcement learning.

A theory linking decentralized partially observable Markov decision processes, multi-agent reinforcement learning, and local communication is established, and the resulting algorithm lays the foundation for event-triggered reinforcement learning. To address the locality and uncertainty of observations in large-scale multi-agent applications, the Decentralized Partially Observable Markov Decision Process (DEC-POMDP) model is considered, and a multi-agent reinforcement learning algorithm based on a consensus protocol is proposed. In a distributed learning environment, the elements of reinforcement learning are difficult to describe effectively from local observations alone, and the learning behavior of each agent is influenced by its teammates. The consensus protocol is used to reach agreement on the globally observed environment, so that some of the strategies generated by repeated observations are eliminated. Because each agent's perception is limited and its perceptual ability varies with spatial position, a weight coefficient is introduced to evaluate the credibility of each observation. Simulation results show that the weight coefficient improves the degree of consensus, reduces the strategy space, and accelerates learning.

To reduce the consumption of communication and computing resources, a novel event-triggered multi-agent reinforcement learning algorithm is proposed. Traditional reinforcement learning algorithms communicate and search for strategies periodically, which consumes communication and computing resources unnecessarily. The proposed algorithm defines a triggering function based on the rate of change of the observation; communication and strategy search are executed intermittently, so the whole learning process is aperiodic. Simulation results show that the communication cost is reduced and the computational load is relieved. For learning problems with less demanding requirements on the convergence rate, trading convergence rate for lower communication and computing cost is an advisable choice.

To balance the search scope against the learning speed in heuristic reinforcement learning, a class of event-triggered heuristic reinforcement learning algorithms is proposed. To improve the accuracy of the prior knowledge used by the heuristic learning algorithm, an event-triggered method for discriminating prior knowledge is designed. In heuristic learning, one way to design the heuristic function is to obtain prior knowledge from the agent's own experience, and the accuracy of that prior knowledge greatly influences the search speed and the quality of the solution. In the proposed algorithm, the triggering function is designed from the rate of change of the number of learning steps and the Frobenius norm of the Q-table, so that the acquisition of prior knowledge changes from a fixed schedule to a flexible one. Furthermore, to address the problem that learning speed is improved at the expense of a constrained strategy search range, an event-triggered heuristic reinforcement learning algorithm is proposed in which the triggering function is designed according to the change in the observation information. The agent can selectively apply the heuristic to the learning process, so that learning is accelerated while the search range is expanded. Simulation results show that prior knowledge is obtained from experience more efficiently and that the quality of the learned strategy and the convergence speed are balanced.
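
The abstract does not give the consensus update itself. As a rough illustration only, the sketch below shows one common way a credibility-weighted agreement on local observations can be computed: ordinary average consensus is run on both the weighted observations and the weights, and each agent takes the ratio. The function name, the scalar observations, the step size, and the example weights are all assumptions made for illustration and are not taken from the dissertation.

```python
def weighted_consensus(obs, weights, neighbors, step=0.3, iters=200):
    """Ratio-consensus estimate of the weighted mean of local observations.

    obs[i]       : agent i's local scalar observation
    weights[i]   : agent i's credibility weight (hypothetical, must be > 0)
    neighbors[i] : indices of agents that i exchanges values with (undirected)
    step         : consensus gain; assumed smaller than 1 / (max node degree)
    """
    num = [w * o for w, o in zip(weights, obs)]  # weighted observations
    den = list(weights)                          # the weights themselves
    for _ in range(iters):
        # Synchronous update: every agent moves toward its neighbors' values.
        num = [x + step * sum(num[j] - x for j in neighbors[i])
               for i, x in enumerate(num)]
        den = [x + step * sum(den[j] - x for j in neighbors[i])
               for i, x in enumerate(den)]
    # Each agent's local estimate of sum(w_i * o_i) / sum(w_i).
    return [n / d for n, d in zip(num, den)]


# Four agents on a line graph; agent 3 observes from far away, so its
# (assumed) credibility weight is small and its outlier reading counts less.
observations = [1.0, 2.0, 3.0, 10.0]
credibility = [1.0, 1.0, 1.0, 0.1]
line_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
estimates = weighted_consensus(observations, credibility, line_graph)
```

With these example values every agent's estimate approaches the weighted mean 7 / 3.1 rather than the plain mean 4, which is the effect a credibility weight is meant to have.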
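
The event-triggered idea described above can be sketched in a few lines. The abstract only says that the triggering function depends on the rate of change of the observation; the relative-change measure, the threshold value, and the `EventTrigger` class below are hypothetical choices for illustration, not the dissertation's definition.

```python
import math


def change_rate(current, last_sent):
    """Relative Euclidean change between two observation vectors (assumed measure)."""
    diff = math.sqrt(sum((c - l) ** 2 for c, l in zip(current, last_sent)))
    base = math.sqrt(sum(l ** 2 for l in last_sent))
    return diff / max(base, 1e-12)  # guard against division by zero


class EventTrigger:
    """Fires only when the observation has changed enough since the last event."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical tuning parameter
        self.last_sent = None       # observation at the most recent event

    def should_fire(self, observation):
        first = self.last_sent is None
        if first or change_rate(observation, self.last_sent) > self.threshold:
            self.last_sent = list(observation)
            return True
        return False


# The agent would communicate and re-run its strategy search only on steps
# where the trigger fires; on the other steps it keeps its current strategy.
trigger = EventTrigger(threshold=0.1)
stream = [[1.0, 0.0], [1.02, 0.0], [1.05, 0.0], [1.3, 0.0], [1.31, 0.0]]
fired = [trigger.should_fire(o) for o in stream]
```

In this toy stream the trigger fires on the first observation and again only when the reading jumps to 1.3, so three of the five steps need no communication. A larger threshold saves more communication at the cost of slower reaction, which matches the trade-off against convergence rate noted in the abstract.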
Keywords/Search Tags: multi-agent system, distributed reinforcement learning, heuristic reinforcement learning, event-triggered, consensus protocol