Design Of Neural Network Accelerator Based On RISC-V Processor

Posted on: 2022-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: J X Liu
GTID: 2518306542461804
Subject: IC Engineering
Abstract/Summary:
The rapid development of computer hardware over the past decade has accelerated the move of machine learning algorithms from theory to practice, and deep learning networks built on machine learning have begun to flourish. Deep learning networks place higher demands on hardware processing performance, and the processing power of an ordinary CPU can no longer meet them. A dedicated hardware structure is therefore needed to handle the large volumes of data in deep learning. The emergence of RISC-V instruction set processors gives designers of deep learning hardware accelerators new options. First, compared with traditional instruction set architectures, the RISC-V instruction set is simple: its core is the RV32I base ISA, on top of which designers can add custom instruction extensions according to their own needs. Second, RISC-V is a completely open-source, free instruction set architecture with a strong software ecosystem. For these reasons, this thesis designs a hardware structure for neural network acceleration based on a CPU that implements the RISC-V instruction set architecture.

First, this thesis analyzes hardware accelerator structures at a theoretical level. The analysis shows that the systolic array structure has many advantages as a hardware accelerator, so it is selected for the accelerator design. Because convolution is commonly used in neural networks, the feasibility of performing convolution on a systolic array is also analyzed. In terms of hardware design, the main work is the design of the processing element (PE), the basic unit of the systolic array, and the composition of multiple PEs into the systolic array structure. To take advantage of the reusability of the input data, double buffering is added inside the PE module. The result of a matrix multiplication usually has a larger bit width than the input matrices, so a higher-bit-width accumulator is designed outside the systolic array. An AXI4-to-SRAM module is designed to carry data between the RISC-V CPU core and the accelerator.

In terms of testing and verification, an accelerator simulation platform based on the RISC-V CPU is built, a corresponding accelerator test program is designed, and functional simulation at the software level is carried out with the VCS simulator. An FPGA prototype verification platform is also designed, and the systolic array accelerator and the RISC-V CPU are evaluated and tested on it. The test results show that, at a main clock frequency of 200 MHz, the performance of the Rocket CPU with the accelerator is improved by about 71 times compared with the Rocket CPU alone. Finally, the entire hardware design is synthesized with Design Compiler against the TSMC 40 nm process library to generate the corresponding netlist. After synthesis, the total area of the entire SoC design is 4205320 μm².
Keywords/Search Tags: deep learning, hardware acceleration, RISC-V CPU, systolic array
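
The abstract describes a systolic array of PEs with a wider accumulator placed outside the array. A minimal behavioural sketch of that idea in Python follows. It is not the thesis's RTL. It assumes a weight-stationary dataflow (each PE holds one weight, activations move right, partial sums move down), which the abstract does not state, and the names `systolic_tile`, `accelerated_matmul`, `reference_matmul`, and `array_rows` are hypothetical. Double buffering and the AXI4-to-SRAM path are not modelled.

```python
def systolic_tile(a_tile, w_tile):
    """One pass through a weight-stationary array (assumed dataflow).

    a_tile is M x K (activations), w_tile is K x N (one weight per PE).
    Returns the M x N partial products leaving the bottom row of PEs.
    """
    m_rows, k_dim, n_cols = len(a_tile), len(w_tile), len(w_tile[0])
    a_reg = [[0] * n_cols for _ in range(k_dim)]  # activations moving right
    p_reg = [[0] * n_cols for _ in range(k_dim)]  # partial sums moving down
    out = [[0] * n_cols for _ in range(m_rows)]
    # The last result leaves the array at cycle (M-1) + (K-1) + (N-1).
    for t in range(m_rows + k_dim + n_cols - 2):
        new_a = [[0] * n_cols for _ in range(k_dim)]
        new_p = [[0] * n_cols for _ in range(k_dim)]
        for k in range(k_dim):
            for j in range(n_cols):
                if j > 0:
                    a_in = a_reg[k][j - 1]          # from the left neighbour
                else:
                    m = t - k                        # array row k is skewed by k cycles
                    a_in = a_tile[m][k] if 0 <= m < m_rows else 0
                p_in = p_reg[k - 1][j] if k > 0 else 0  # from the PE above
                new_a[k][j] = a_in
                new_p[k][j] = p_in + a_in * w_tile[k][j]
        a_reg, p_reg = new_a, new_p
        # Column j of the bottom row emits the result for input row m this cycle.
        for j in range(n_cols):
            m = t - (k_dim - 1) - j
            if 0 <= m < m_rows:
                out[m][j] = p_reg[k_dim - 1][j]
    return out


def accelerated_matmul(a, w, array_rows=4):
    """Tile the K dimension over the array and sum tiles in an external accumulator."""
    m_rows, k_dim, n_cols = len(a), len(w), len(w[0])
    # External accumulator: Python ints are unbounded; in hardware this would
    # be wider than the inputs (for example 32 bits for 8-bit operands).
    acc = [[0] * n_cols for _ in range(m_rows)]
    for k0 in range(0, k_dim, array_rows):
        a_tile = [row[k0:k0 + array_rows] for row in a]
        w_tile = w[k0:k0 + array_rows]
        part = systolic_tile(a_tile, w_tile)
        for i in range(m_rows):
            for j in range(n_cols):
                acc[i][j] += part[i][j]
    return acc


def reference_matmul(a, w):
    """Plain triple-loop matrix multiplication used as a check."""
    return [[sum(a[i][k] * w[k][j] for k in range(len(w)))
             for j in range(len(w[0]))] for i in range(len(a))]
```

For example, `accelerated_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` gives `[[19, 22], [43, 50]]`, matching `reference_matmul`. With 8-bit operands such as 127, a single product is already 16129, which is why the accumulated sum needs more bits than the inputs.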