
Research On Multi-agent Reinforcement Learning Algorithm And Its Equilibrium Realization Path Under Repeated Game

Posted on: 2024-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: Q W Qin
Full Text: PDF
GTID: 2530307130469994
Subject: Mathematics
Abstract/Summary:
In multi-agent systems, the presence of multiple agents means that their decisions influence and interfere with one another, so how to achieve coordination among agents and reach a predetermined goal has long been a central topic in game theory and artificial intelligence research. One important approach is to use reinforcement learning to let agents learn autonomously. However, designing reinforcement learning algorithms for coordination in multi-agent systems faces many difficulties, such as the non-stationarity of the environment and the enormous dimension of the joint state space. This paper builds on single-agent reinforcement learning algorithms to design multi-agent reinforcement learning methods that allow agents to coordinate with and learn from one another under unknown game structures and limited information sharing, and ultimately reach an optimal joint strategy. At the same time, effective realization paths to the coordination equilibrium of each game are obtained by analyzing the learning paths of the designed algorithms through numerical experiments and game-theoretic analysis.

Starting from two-agent, two-action repeated games, we first propose a cooperation learning algorithm based on gradient ascent, called CL-IGA. CL-IGA improves each agent's strategy by following the partial derivatives of the expected reward function, with the aim of learning the optimal joint strategy. However, analysis of CL-IGA and its equilibrium realization path reveals room for improvement: agents using the algorithm must know the other agents' strategies and rewards, so it cannot be applied in unknown game environments. We therefore combine it with reinforcement learning, relax these requirements, and propose the CL-PGA algorithm, which enables agents to coordinate and reach the optimal joint strategy while observing only the average reward. Finally, we run numerical simulations of CL-IGA and CL-PGA on several classic game models and obtain effective realization paths for the coordination equilibria of these games from the experimental results.

Previous research on multi-agent learning has mostly focused on small numbers of agents; work on arbitrary numbers of agents and actions remains limited and lacks suitable theoretical analysis methods. For this setting, this paper designs an algorithm called MRPL for mutual coordination among agents. In an unknown environment, agents using MRPL improve their strategies by observing the common reward of all agents and combining it with past experience, and eventually converge to the optimal joint action. We prove that when the components of the optimal joint action are unique, the optimal joint action reached by the agents is an asymptotically stable point. Simulation experiments further yield effective realization paths for the coordination equilibria of two classes of classic games.

To make the designed algorithm more widely applicable, including to multi-stage repeated games, we extend MRPL to stochastic games and compare it with two other algorithms in two simulation experiments. The results show that MRPL outperforms both of the other algorithms in these two stochastic games.
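The abstract does not spell out CL-IGA's update rule, but the gradient-ascent idea it builds on can be illustrated with a minimal sketch: in a two-agent, two-action common-payoff game, each agent differentiates the expected reward with respect to its own action probability and climbs that gradient, which, as the abstract notes for CL-IGA, requires knowing the other agent's strategy. The payoff matrix A, step size, and initial strategies below are illustrative assumptions, not values from the thesis.

import numpy as np

# Illustrative common-payoff 2x2 coordination game (not from the thesis):
# both agents receive A[i, j] when agent 1 plays i and agent 2 plays j.
A = np.array([[1.0, 0.0],
              [0.0, 0.5]])

def grad_p(p, q):
    # dV/dp, where V(p, q) is the expected common reward and p, q are
    # the probabilities the two agents assign to their first actions
    return q * (A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1]) + (A[0, 1] - A[1, 1])

def grad_q(p, q):
    # dV/dq, the corresponding partial derivative for the second agent
    return p * (A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1]) + (A[1, 0] - A[1, 1])

p, q, eta = 0.3, 0.4, 0.05               # initial strategies and step size
for _ in range(2000):
    p_next = np.clip(p + eta * grad_p(p, q), 0.0, 1.0)   # simultaneous
    q_next = np.clip(q + eta * grad_q(p, q), 0.0, 1.0)   # gradient steps
    p, q = p_next, q_next

print(p, q)   # from this start, climbs to the payoff-dominant equilibrium (1, 1)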
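CL-PGA is described as needing only the average reward. One standard way to run gradient ascent from reward observations alone is a score-function (REINFORCE-style) estimate with the running average reward as a baseline; the abstract does not say which estimator CL-PGA uses, so the following sketch is only an assumption in that spirit, with the game, step size, and exploration floor eps chosen for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Same illustrative common-payoff game as above (not from the thesis)
A = np.array([[1.0, 0.0],
              [0.0, 0.5]])

p = np.array([0.5, 0.5])   # each agent's probability of its first action
baseline = 0.0             # running average reward, the only shared signal
eta, eps = 0.01, 0.05      # step size; eps keeps strategies exploratory

for t in range(1, 50001):
    a = (rng.random(2) > p).astype(int)     # sample the joint action
    r = A[a[0], a[1]]                       # observe the common reward
    baseline += (r - baseline) / t          # update the average reward
    # score-function gradient estimate from the sampled action alone
    score = np.where(a == 0, 1.0 / p, -1.0 / (1.0 - p))
    p = np.clip(p + eta * (r - baseline) * score, eps, 1.0 - eps)

print(p)   # both entries typically drift toward 1 - eps: coordination on action 0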
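For MRPL, the abstract states only that agents observe the common reward of all agents, combine it with past experience, and converge to the optimal joint action when its components are unique. A generic payoff-based rule consistent with that description, in the spirit of distributed Q-learning for stateless common-payoff games rather than MRPL's actual update, is sketched below; the game R, the table best, and the epsilon-greedy exploration are all hypothetical.

import numpy as np

rng = np.random.default_rng(1)

# Illustrative 3-agent, 3-action common-payoff game (not from the thesis):
# the common reward depends on the full joint action, with a unique
# optimal joint action at (2, 2, 2).
n_agents, n_actions = 3, 3
R = rng.random((n_actions,) * n_agents)
R[2, 2, 2] = 2.0

# Each agent tracks the best common reward seen so far for each of its
# own actions and plays epsilon-greedy with respect to those estimates.
best = np.zeros((n_agents, n_actions))
eps = 0.2

for t in range(20000):
    greedy = best.argmax(axis=1)
    explore = rng.random(n_agents) < eps
    a = np.where(explore, rng.integers(n_actions, size=n_agents), greedy)
    r = R[tuple(a)]                   # common reward: the only feedback
    for i in range(n_agents):
        best[i, a[i]] = max(best[i, a[i]], r)

print(best.argmax(axis=1))   # settles on [2, 2, 2] once exploration hits the optimum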
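For the extension to stochastic games, the abstract gives no algorithmic detail, so the sketch below only shows one plausible way a stateless common-reward learner of this kind might be carried over to multi-stage play: keep the same optimistic max-update per state and bootstrap it with a discounted value. The two-state game, discount factor, and update rule are illustrative assumptions, not the thesis's experiments or baselines.

import numpy as np

rng = np.random.default_rng(2)

# Illustrative deterministic two-state, two-agent, two-action common-payoff
# stochastic game (not from the thesis). In state 0 the myopic choice pays
# 0.2 but stays put, while the joint action (1, 1) pays nothing yet leads
# to state 1, where (1, 1) pays 1. The far-sighted policy plays 1 everywhere.
def step(s, a0, a1):
    if s == 0:
        return (0.0, 1) if (a0, a1) == (1, 1) else (0.2, 0)
    return (1.0, 0) if (a0, a1) == (1, 1) else (0.0, 0)

gamma, eps = 0.9, 0.2
q = np.zeros((2, 2, 2))      # q[agent, state, own action]

s = 0
for t in range(50000):
    greedy = q[:, s, :].argmax(axis=1)
    explore = rng.random(2) < eps
    a = np.where(explore, rng.integers(2, size=2), greedy)
    r, s_next = step(s, a[0], a[1])
    for i in range(2):
        # optimistic max-update with a discounted bootstrap on the next state
        q[i, s, a[i]] = max(q[i, s, a[i]], r + gamma * q[i, s_next, :].max())
    s = s_next

print(q.argmax(axis=2))      # each agent's greedy policy: action 1 in both states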
Keywords/Search Tags: Multi-agent reinforcement learning, Multi-agent system, Repeated game, Stochastic game, Collaborative learning