Research On Generalization Problem In Reinforcement Learning Based On Information Bottleneck

Posted on: 2023-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Sun
Full Text: PDF
GTID: 2558307154976629
Subject: Engineering
Abstract/Summary:
Deep reinforcement learning has flourished in recent years, but it still has many open problems and often performs poorly on practical tasks. The most severe is the sharp drop in performance when a reinforcement learning agent faces new environments that it has not seen during training. We call this phenomenon the generalization problem in reinforcement learning, and it can have serious consequences. For example, a trained agent may merely memorize action sequences instead of learning the underlying rules, and so fail to generalize to real-world settings. Overcoming this obstacle is therefore urgent and critical. The main research contents of this thesis are as follows:
(1) This thesis surveys the existing open-source reinforcement learning environments for generalization and divides them into three types according to how each defines the generalization problem. The proposed solutions are grouped into four categories: regularization, representation learning, multiple Markov decision processes, and training strategy optimization. The implementation of specific methods and their motivations are then listed in detail, and recommendations are given according to the basic rules.
(2) From the perspective of representation learning in reinforcement learning, this thesis demonstrates that deep neural networks, which serve as the function approximators in reinforcement learning, play an important role in improving generalization capability. The network structure is then deepened and widened to show that its generalization potential has not been fully exploited and that improving the generalization performance of the neural network is necessary.
(3) Based on the information bottleneck, this thesis proposes ADNet to enhance the generalization ability of state-based reinforcement learning algorithms, and proves its effectiveness according to information bottleneck theory. Extensive experiments and detailed analysis on the Gym generalization environment show that ADNet can significantly improve generalization ability.
(4) This thesis constructs a path planning environment for UAVs based on ocean current changes. The SAC algorithm is used to perform path planning in a complex ocean environment, with optimization carried out from the perspective of the state space and the reward function. In addition, environmental generalization is controlled by changing the target point and the ratio of the ocean current, and the results show that ADNet can improve generalization ability in a variety of different environments.
Keywords/Search Tags: Reinforcement learning, Generalization problem, Information bottleneck, Deep neural network, Path planning of UAVs