Research On FPGA Hardware Acceleration Platform For Reinforcement Learning

Posted on: 2020-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: L Qin
GTID: 2428330596476227
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
In recent years, the field of artificial intelligence has developed continuously. One of its core algorithms, deep reinforcement learning, combines the perceptual ability of deep learning with the decision-making ability of reinforcement learning. It has been widely used in industrial manufacturing, robot control, simulation, and games. Deep reinforcement learning is computationally intensive. At present, the popular hardware framework for training deep reinforcement learning is CPU+GPU, but the high power consumption of the GPU makes it difficult to deploy on mobile devices. An FPGA is a programmable logic device with low power consumption, configurability, and rich computing resources, which makes it well suited as a computing device for deep reinforcement learning.

The Deep Q Network (DQN) algorithm is a significant algorithm in deep reinforcement learning. It uses a neural network to perceive the environment, and it uses an experience pool and a target network to stabilize training. Against this background, this thesis uses a CPU+FPGA hardware framework to implement the training of the DQN algorithm. The work uses the PYNQ platform, which is based on the CPU+FPGA hardware framework and uses Python libraries to call programmable logic resources, making it suitable for training deep reinforcement learning algorithms. On this basis, the thesis analyzes the DQN algorithm in detail and uses the Vivado HLS tool to design three IP modules (an action network, an evaluation network, and a target network) that accelerate the computation involved in the experience pool and target network methods. These modules are then integrated into the system overlay. Finally, the complete training of the DQN algorithm is implemented in Python in the Jupyter Notebook development environment.

The experimental results show that the DQN algorithm implemented on the PYNQ platform solves the 'CartPole' task successfully and reaches approximately the highest reward after about 300 episodes. The estimated power consumption of the implementation is only 1.74 W. Compared with the same algorithm implemented on the CPU and the GPU, the efficiency reached 70.5 times that of the CPU and 4.3 times that of the GPU, which verifies the feasibility of the low-power, high-efficiency implementation of deep reinforcement learning designed in this thesis.
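
The two stabilizing mechanisms named in the abstract, the experience pool and the target network, can be illustrated with a minimal plain-Python sketch. This is not the thesis's Vivado HLS or PYNQ code. The names `ReplayPool`, `td_target`, and `sync_target`, the default discount factor, and the use of a hard weight copy are all assumptions chosen for illustration.

```python
import random
from collections import deque


class ReplayPool:
    """Fixed-capacity experience pool (hypothetical name); oldest entries drop first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition observed while interacting with the environment.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the correlation
        # between consecutive transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.99):
    """Standard DQN target: r + gamma * max_a' Q_target(s', a'), or r at episode end.

    next_q_values are assumed to come from the target network, not the
    network currently being trained.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def sync_target(online_weights, target_weights):
    """Hard update (an assumption): copy the trained weights into the target network."""
    target_weights[:] = online_weights


# Usage with toy values.
pool = ReplayPool(capacity=3)
for step in range(4):
    pool.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)
batch = pool.sample(2)

online, target = [0.1, 0.2], [0.0, 0.0]
sync_target(online, target)
```

Keeping the target network's weights fixed between synchronizations means the regression target does not shift on every gradient step, which is the stabilizing effect the abstract refers to.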
Keywords/Search Tags: deep reinforcement learning, hardware acceleration, PYNQ, FPGA