Research And Improvement Of Model-free Reinforcement Learning Based On Online Policy

Posted on:2023-02-12

Degree:Master

Type:Thesis

Country:China

Candidate:Q Gao

Full Text:PDF

GTID:2568306914477304

Subject:Information and Communication Engineering

Abstract/Summary:

Reinforcement learning, a branch of machine learning, trains agents to interact with an environment in order to obtain the strategy that maximizes cumulative reward. Deep learning offers powerful function approximation and representation learning, which helps reinforcement learning make decisions in high-dimensional, complex scenarios. Deep reinforcement learning combines the perception ability of deep learning with the decision-making ability of reinforcement learning to achieve end-to-end intelligent decision making, and it has been widely applied to control problems such as recommendation systems, resource management, robot control, and game AI. Model-free algorithms based on online policy are one of the main types of deep reinforcement learning: the agent obtains data through real-time interaction with the environment and learns policies directly from that data. The quality of the data obtained during interaction determines the level of strategy the agent can learn, and the way the agent explores the environment determines the quality of that data. Only sufficient exploration can yield good strategies, but excessive exploration slows training. To achieve better exploration in model-free deep reinforcement learning algorithms based on online policy, so that agents can learn better strategies, this thesis first proposes a fuzzy DQN algorithm based on stable exploration. The algorithm introduces a parametric noise network and uses it to add noise in parameter space, increasing the agent's exploration ability. At the same time, drawing on fuzzy theory, the state-action values are processed by a fuzzy neural network, which makes the agent explore the environment more stably and converge quickly during training. Furthermore, a fuzzy DQN algorithm based on feature fusion is proposed. While high-dimensional state features are extracted by the parametric noise network, a fuzzy neural network is introduced to extract low-dimensional features, which makes feature extraction more thorough and thus yields higher rewards and better strategies. Finally, an intelligent virtual environment is designed and constructed to verify the exploration ability of the two algorithms in different state-action spaces.
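The abstract does not give the layer equations, so the following is only a minimal sketch of parameter-space noise in general, using one common formulation of a noisy linear layer with factorized Gaussian noise. It is not the thesis's implementation. The class name NoisyLinear, the helper scaled_noise, the constant sigma0, and the fixed (untrained) noise scale are all assumptions made for illustration; the fuzzy neural network component is not modeled here.

```python
import math
import random


def scaled_noise(rng, size):
    """Draw `size` standard normal samples and apply f(x) = sign(x) * sqrt(|x|)."""
    samples = [rng.gauss(0.0, 1.0) for _ in range(size)]
    return [math.copysign(math.sqrt(abs(x)), x) for x in samples]


class NoisyLinear:
    """Hypothetical linear layer whose weights and biases are perturbed by noise.

    Assumption: a single fixed noise scale is used for every parameter. In a
    trainable version the noise scales would be learned alongside the means.
    """

    def __init__(self, n_in, n_out, sigma0=0.5, seed=0):
        self.rng = random.Random(seed)
        self.n_in, self.n_out = n_in, n_out
        bound = 1.0 / math.sqrt(n_in)
        # Mean weights and biases (the parameters gradient descent would update).
        self.w_mu = [[self.rng.uniform(-bound, bound) for _ in range(n_in)]
                     for _ in range(n_out)]
        self.b_mu = [self.rng.uniform(-bound, bound) for _ in range(n_out)]
        self.sigma = sigma0 * bound
        self.resample()

    def resample(self):
        """Draw fresh factorized noise: one vector per input, one per output."""
        self.eps_in = scaled_noise(self.rng, self.n_in)
        self.eps_out = scaled_noise(self.rng, self.n_out)

    def forward(self, x, noisy=True):
        """Return W x + b, with noise added to W and b when `noisy` is True."""
        out = []
        for j in range(self.n_out):
            total = self.b_mu[j]
            if noisy:
                total += self.sigma * self.eps_out[j]
            for i in range(self.n_in):
                w = self.w_mu[j][i]
                if noisy:
                    # Weight noise is the outer product eps_out[j] * eps_in[i].
                    w += self.sigma * self.eps_out[j] * self.eps_in[i]
                total += w * x[i]
            out.append(total)
        return out


# Usage: treat the layer output as Q-values for two actions in a toy state.
layer = NoisyLinear(4, 2)
state = [1.0, 1.0, 1.0, 1.0]
q_noisy = layer.forward(state)              # perturbed Q-values drive exploration
q_mean = layer.forward(state, noisy=False)  # noise-free Q-values for evaluation
greedy_action = max(range(len(q_noisy)), key=q_noisy.__getitem__)
```

In this sketch the agent always acts greedily on the perturbed Q-values, so exploration comes from resampling the noise rather than from a separate epsilon-greedy rule.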

Keywords/Search Tags:

Deep Reinforcement Learning, Noisy Network, Exploration and Exploitation, Fuzzy Neural Network

Related items

1	Optimization Method For Reinforcement Learning Based On Overestimation Control And Exploration Enhancement
2	Research On Reinforcement Learning Algorithm Based On Improved Action Decision Method
3	Exploration Strategy Of Deterministic Policy In Deep Reinforcement Learning
4	Research On Efficient Exploration In Reinforcement Learning
5	Robust Policy Gradient Algorithm Based On Actor-Critic In Deep Reinforcement Learning
6	Research On Deep Reinforcement Learning Algorithm For Continuous Action Control
7	Research On Exploration-Incentivized Robust Deep Reinforcement Learning
8	Based On Deep Reinforcement Learning And Neural Network On XSS Attack Detection Technology
9	Research On Network Traffic Anomaly Detection Based On Deep Reinforcement Learning
10	Study Of Fuzzy Neural Network Control Based On Reinforcement Learning And Its Application