
Multi-agent learning and coordination algorithms for distributed dynamic resource allocation

Posted on: 2005-01-27
Degree: Ph.D
Type: Thesis
University: Stanford University
Candidate: Vengerov, David
Full Text: PDF
GTID: 2458390008991319
Subject: Operations Research
Abstract/Summary:
The main problem addressed in this thesis is coordinated learning among multiple agents operating in a common environment. This general problem is motivated by the practical concern of distributed dynamic resource allocation among multiple agents with fluctuating demands for a common resource. Two instances of this problem are considered in this thesis: dynamic data or thread migration in computer networks, and dynamic bandwidth sharing among multiple interacting wireless transmitters.

To address these problems, a new general multi-agent coordination framework is proposed: Spreading Impact Coordination (SIC), suitable for agents operating jointly in stochastic dynamic environments. In this framework, each agent receives a signal carrying partial information about the global situation, which it uses as a new state variable. The agent then learns to interpret and use this information in the context of its local state. This approach provides a dynamic balance between fully centralized and fully distributed control.

Reinforcement learning (RL) is a standard framework in which a single agent learns optimal state-based policies in stochastic dynamic environments, and Q-learning is the most commonly used RL algorithm. As an alternative to Q-learning, this thesis presents a new reinforcement learning algorithm, Actor-Critic Fuzzy Reinforcement Learning (ACFRL-2). ACFRL-2 extends Q-learning to domains with very large (and possibly continuous) state-action spaces, which arise in the wireless transmitter application domain. A convergence proof for ACFRL-2 is also presented.

To test the effectiveness of the SIC framework in stochastic dynamic environments, two architectures are presented in this thesis: one combining SIC with Q-learning and one combining it with ACFRL-2.
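For readers unfamiliar with the Q-learning baseline the abstract refers to, the standard tabular update rule can be sketched as follows. This is a generic illustration, not the thesis's implementation; the function name, states, and actions are hypothetical placeholders.

```python
# Minimal tabular Q-learning update (illustrative sketch, not thesis code).
# Q is a dict mapping (state, action) pairs to value estimates.

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One step of Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q.get((s_next, an), 0.0) for an in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]
```

Repeated application of this update, with sufficient exploration and decaying step sizes, converges to the optimal action values in finite state-action spaces, which is precisely the setting ACFRL-2 is meant to move beyond.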
The first architecture has been applied to the problem of distributed dynamic load balancing in content distribution networks, and the second to the problem of dynamic bandwidth sharing through power control among multiple interacting wireless transmitters. Both applications show very promising results, and their generality indicates that the SIC framework, in conjunction with the presented reinforcement learning algorithms, can be applied successfully to a wide variety of practical problems.
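The core SIC idea, as described in the abstract, is that each agent treats a broadcast signal carrying partial global information as an additional state variable alongside its local state. A minimal sketch of what such an agent might look like, combined with the Q-learning update, is given below; the class, state encodings, and action names are illustrative assumptions, not the thesis's actual architecture.

```python
# Hypothetical sketch of a SIC-style agent: value estimates are keyed on the
# augmented state (local_state, global_signal), so the agent learns to
# interpret the global signal in the context of its local situation.

class SICAgent:
    def __init__(self, alpha=0.1, gamma=0.9):
        self.Q = {}          # (augmented_state, action) -> value estimate
        self.alpha = alpha   # learning rate
        self.gamma = gamma   # discount factor

    def augmented_state(self, local_state, global_signal):
        # The partial global information becomes an extra state variable.
        return (local_state, global_signal)

    def update(self, state, action, reward, next_state, actions):
        # Standard Q-learning step over the augmented state space.
        best_next = max(self.Q.get((next_state, a), 0.0) for a in actions)
        old = self.Q.get((state, action), 0.0)
        self.Q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old
        )
```

In a load-balancing setting, for example, `local_state` could encode an agent's own queue length while `global_signal` summarizes network-wide load, letting the same local observation lead to different migration decisions depending on the global picture.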
Keywords/Search Tags:Dynamic, Problem, Reinforcement learning, Agent, Among multiple, Framework, Coordination, Resource