Hardware Implementation And Application Of Reinforcement Learning Algorithm For Online Decision

Posted on:2021-02-10

Degree:Master

Type:Thesis

Country:China

Candidate:X P Li

Full Text:PDF

GTID:2428330611498116

Subject:Instrumentation engineering

Abstract/Summary:

Online decision-making allows intelligent agents to make autonomous decisions without human intervention. It has broad application prospects in military and civil fields, such as drone maneuvering decisions, robot control, and autonomous driving. Compared with traditional decision-making algorithms such as expert systems, algorithms based on deep reinforcement learning can learn online and support end-to-end perception and decision-making, so they have received more attention in applications. However, deep reinforcement learning is computationally intensive and is usually trained on GPUs, which makes it difficult to deploy on edge systems with limited computing resources and low power budgets. To address this, this thesis targets online decision-making applications and studies the hardware implementation and application of FPGA-based reinforcement learning algorithms. The main research work is as follows:

1. Starting from the design requirements, the overall research plan was determined by analyzing the structure of a typical deep reinforcement learning algorithm, Deep Q-Network (DQN), and by evaluating FPGA computing resources. Specifically, a hardware implementation architecture based on software/hardware collaborative computing was proposed, the algorithm acceleration tasks were decomposed, a streaming computing structure was chosen as the design method for the hardware accelerator, and the application verification method was defined.

2. Based on the hardware implementation architecture and the acceleration task decomposition scheme, the DQN hardware accelerator was designed. This accelerator is the core of the hardware implementation work. DQN acceleration involves the computing characteristics of both network inference and training, and parallel computing must also account for data dependence and memory access bandwidth. An inside-out design approach was therefore adopted: the operator units, calculation modules, and control module of the accelerator were designed in turn, and the whole accelerator was packaged and simulated so that it can be reused in hardware implementations for different decision-making applications.

3. The design space of the parallel computing parameters of the DQN hardware accelerator was explored. Combining the characteristics of FPGA resources with the neural network structure of the DQN algorithm, the resources and calculation time consumed by the accelerator were modeled and analyzed in order to find the best parallel computing parameters for a given application. The accelerator was then integrated into the system as an IP core, and the scheduling design for the hardware implementation of the DQN algorithm was completed.

4. Two applications were verified: inverted pendulum control decision-making and UAV ground-attack maneuver decision-making. The verification work consists of four parts: application analysis, application environment modeling, accelerator parameter exploration and optimization, and performance analysis. The test results show that the design is correct and meets the design requirements for decision time and power consumption. Training time and power consumption were also compared with those of a CPU platform and a GPU platform, and the results show that the FPGA has certain advantages in both.
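The design space exploration described in item 3 can be illustrated with a minimal Python sketch. It is not the model used in the thesis: the layer sizes, the DSP budget, the candidate parallelism values, and the function names `layer_cycles`, `estimate`, and `explore` are all hypothetical. It assumes a fully connected Q-network in which `p_in * p_out` multiply-accumulate units operate in parallel, so resource use grows with the product of the two parameters while the cycle count shrinks.

```python
from itertools import product
from math import ceil

# Hypothetical (inputs, outputs) sizes of a small fully connected Q-network.
LAYERS = [(4, 64), (64, 64), (64, 2)]
# Assumed number of available multipliers; not a figure from the thesis.
DSP_BUDGET = 220


def layer_cycles(n_in, n_out, p_in, p_out):
    """Cycles for one layer when p_in x p_out multiply-accumulates run per cycle."""
    return ceil(n_in / p_in) * ceil(n_out / p_out)


def estimate(p_in, p_out):
    """Return (dsp_used, total_cycles) for one forward pass under this toy model."""
    dsp_used = p_in * p_out
    cycles = sum(layer_cycles(n_in, n_out, p_in, p_out) for n_in, n_out in LAYERS)
    return dsp_used, cycles


def explore(budget=DSP_BUDGET, candidates=(1, 2, 4, 8, 16, 32, 64)):
    """Search the candidate grid exhaustively and return the fastest feasible point.

    Ties on cycle count are broken in favour of the lower resource use.
    Returns ((p_in, p_out), total_cycles, dsp_used).
    """
    best = None
    for p_in, p_out in product(candidates, repeat=2):
        dsp_used, cycles = estimate(p_in, p_out)
        if dsp_used > budget:
            continue  # configuration does not fit the assumed resource budget
        key = (cycles, dsp_used)
        if best is None or key < best[0]:
            best = (key, (p_in, p_out))
    return best[1], best[0][0], best[0][1]


if __name__ == "__main__":
    params, cycles, dsp = explore()
    print(f"p_in, p_out = {params}, cycles = {cycles}, DSPs = {dsp}")
```

A real model would also need terms for on-chip memory, access bandwidth, and the backward pass used in training, which the abstract names as design concerns but does not quantify.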

Keywords/Search Tags:

Online decision, Reinforcement learning, Hardware Acceleration, Inverted pendulum control decision

Related items

1	The Study About Control Of The Inverted Pendulum Based On Reinforcement Learning
2	Control Of The Inverted Pendulum Based On Reinforcement Learning
3	Relay Balance Control Of Double Quadrotors Inverted Pendulum
4	Research On Application Of Reinforcement Learning In Swing-up And Balance Control Of Inverted Pendulum
5	The Control Of The Inverted Pendulum Based On Reinforcement Learning
6	Study On Tracking Control And Simulation Of Inverted Pendulum
7	Design Of Intelligent Control Algorithm And Realization In Inverted Pendulum
8	Improvement And Applications For Q-learning Reinforcement Learning Algorithms
9	Research On Inverted Pendulum Control Algorithm Based On Reinforcement Learning
10	Stability Control Of Inverted Pendulum System