Research On DQN (Deep Q-Network) Algorithm In Complex Environment

Posted on:2024-07-02

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Shi

GTID:2568307106481884

Subject:Software engineering

Abstract/Summary:

Reinforcement learning is a branch of machine learning in which an agent learns optimal behavior by interacting with its environment. Traditional reinforcement learning algorithms store the action-value function in a table, an approach that breaks down for high-dimensional inputs. Deep reinforcement learning addresses this by combining deep learning with reinforcement learning, replacing the table with a neural network to achieve broader applicability. However, deep reinforcement learning algorithms still perform poorly in complex environments. This thesis studies that problem in depth, with the aim of improving the performance of deep reinforcement learning algorithms in complex environments.

The research object is the game Doudizhu, which has several complex characteristics: decisions depend on historical information, rewards are sparse, randomness is strong, and the state-action space is huge. These are also common challenges in real-world production and everyday applications, and this thesis proposes solutions to them.

First, to address the challenges of incorporating historical information and of sparse rewards, this thesis proposes the DGQN and DGQN_RLHF methods. DGQN incorporates a GRU neural network into the DQN algorithm so that information from previous states can be used to improve performance. DGQN_RLHF replaces the reward function in DGQN with training based on human feedback, to address the sparse rewards encountered during training.

Second, to address the challenges of strong randomness and a large state-action space, this thesis proposes the Soft Dueling DQN algorithm, based on maximum entropy and dueling networks. Entropy serves as a measure of randomness, and the maximum-entropy objective is integrated into the DQN algorithm so that it can learn a stochastic policy even in a highly random environment. The dueling network reduces the impact of the large state-action space on the algorithm by keeping the agent from focusing on inferior actions.
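
The abstract does not give the exact formulation of Soft Dueling DQN, so the following is only a minimal sketch of the two ingredients it names, assuming the standard mean-subtracted dueling aggregation and the standard soft (maximum-entropy) Q-learning value. All function names and the temperature parameter `alpha` are hypothetical and are not taken from the thesis.

```python
import math


def dueling_q(value, advantages):
    """Combine a state value V and per-action advantages A into Q-values.

    Assumes the common aggregation Q(a) = V + (A(a) - mean(A)).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def soft_policy(q_values, alpha=1.0):
    """Stochastic policy with pi(a) proportional to exp(Q(a) / alpha).

    alpha is an assumed temperature: larger values give a more random policy.
    """
    q_max = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - q_max) / alpha) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def soft_value(q_values, alpha=1.0):
    """Soft state value alpha * log(sum_a exp(Q(a) / alpha))."""
    q_max = max(q_values)
    log_sum = math.log(sum(math.exp((q - q_max) / alpha) for q in q_values))
    return q_max + alpha * log_sum


def soft_td_target(reward, next_q_values, gamma=0.99, alpha=1.0, done=False):
    """One-step target that bootstraps from the soft value of the next state."""
    if done:
        return reward
    return reward + gamma * soft_value(next_q_values, alpha)


if __name__ == "__main__":
    q = dueling_q(value=1.0, advantages=[0.0, 1.0, 2.0])
    print(q)
    print(soft_policy(q, alpha=0.5))
    print(soft_td_target(reward=1.0, next_q_values=q, gamma=0.9, alpha=0.5))
```

In this sketch the soft value replaces the hard maximum of ordinary DQN in the bootstrap target. As `alpha` approaches zero the soft value approaches the maximum Q-value and the policy approaches the greedy one, while larger `alpha` keeps the policy closer to uniform.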

Keywords/Search Tags:

Deep reinforcement learning, Maximum entropy, Human feedback, Complex environment

Related items

1	Research On Motion Planning In Dynamic Environment Based On Deep Reinforcement Learning
2	Deep Reinforcement Learning In Maximum Entropy Framework With Automatic Adjustment For Path Planning
3	Research On Inverse Reinforcement Learning Based On Maximum Entropy Theory
4	Robot Control Strategy In Complex Environment Based On Deep Reinforcement Learning
5	Research On Active SLAM Algorithm Based On Deep Reinforcement Learning In Complex Environment
6	Design And Implementation Of A Customer Feedback System Based On Maximum Entropy Model
7	Inverse Reinforcement Learning Algorithms In Semi-markov Environment
8	Research On The Maximum Weight Matching Problem Based On Deep Reinforcement Learning
9	Social Navigation Planning Method For Mobile Robots
10	Research On Intelligent Anti-jamming Decision Technology Of Frequency-Hopping Communication