
Reinforcement Learning Technology Optimization For Heterogeneous Multi-agent Game Confrontation

Posted on: 2023-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: K J Wan
Full Text: PDF
GTID: 2568306791481624
Subject: Computer software and theory
Abstract/Summary:
In recent years, with the growth of computing power and data volume, artificial intelligence has become a popular research direction in computer science. Reinforcement learning, the artificial intelligence method closest to the human learning process, is currently one of its most active subfields. Multi-agent reinforcement learning, at the intersection of game theory and artificial intelligence, is among the most cutting-edge research directions and has been widely applied in academia and industry, for example in robotics, games, and recommendation systems. However, many scientific and engineering challenges remain before multi-agent reinforcement learning matures into an established AI technology such as face recognition or text classification. First, because the reward signal returned by the environment is sparse, training efficiency is low and the amount of data required is huge, leading to high hardware and time costs. Second, existing algorithms do not account for the heterogeneity between agents, which is in fact an important factor in multi-agent game confrontation problems. Finally, trained reinforcement learning models often overfit to specific tasks, so they lack generalization and are unstable when applied to different scenarios.

To address the heterogeneity problem in multi-agent game confrontation, this paper proposes a grouping approach: agents are divided into different populations according to the characteristics of their observation and action spaces, so that the heterogeneity problem can be modeled and formalized as a marginal optimization problem. This optimization problem is solved by alternating maximization, and its convergence and local optimality are proved, providing a new perspective for understanding and exploiting the relationships between heterogeneous agents. On this basis, the paper also proposes an efficient two-stage heterogeneous fusion iterative method, which fine-tunes an existing model so that it adapts quickly to heterogeneous tasks, and trains the different agent groups sequentially and iteratively until the algorithm converges.

To address generalization, this paper proposes a method of state modeling and feature extraction. The state vector is modeled as a special matrix whose size is independent of the number of agents, and valuable features are extracted from this matrix with a convolutional network. In addition, a death-mask technique is used to prevent dead agents from influencing the computation of the loss function. Finally, extensive experiments on different maps of the StarCraft platform show that the proposed method outperforms state-of-the-art algorithms on difficult heterogeneous multi-agent tasks and generalizes better.
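The death-mask idea described above can be sketched in a few lines. This is a minimal illustration, not the thesis's actual implementation: the function and variable names (`masked_td_loss`, `td_errors`, `alive_mask`) are hypothetical, and the per-agent TD-error formulation is an assumption; the point is only that dead agents contribute nothing to the loss and do not dilute its average.

```python
import numpy as np

def masked_td_loss(td_errors: np.ndarray, alive_mask: np.ndarray) -> float:
    """Mean squared TD error over living agents only (illustrative sketch).

    td_errors:  (n_agents,) per-agent temporal-difference errors
    alive_mask: (n_agents,) 1.0 if the agent is alive, 0.0 if dead
    """
    masked = td_errors * alive_mask
    # Normalize by the number of living agents, not the total,
    # so dead agents neither add error nor dilute the average.
    n_alive = max(alive_mask.sum(), 1.0)
    return float((masked ** 2).sum() / n_alive)

# Example: the two dead agents contribute nothing to the loss.
errors = np.array([0.5, -1.0, 2.0, 0.0])
alive = np.array([1.0, 1.0, 0.0, 0.0])
loss = masked_td_loss(errors, alive)  # (0.25 + 1.0) / 2 = 0.625
```

Without the mask, the dead third agent's large error (2.0) would dominate the loss even though that agent can no longer act, which is exactly the bias the technique avoids.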
Keywords/Search Tags:reinforcement learning, multi-agent, heterogeneity, generalizability