
Research And Design Of Convolutional Neural Network Accelerator Based On RISC-V Extended Instruction Set

Posted on: 2022-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Chen
Full Text: PDF
GTID: 2518306737455644
Subject: Materials Science and Engineering
Abstract/Summary:
Convolutional neural networks (CNNs) are widely applied in image recognition, path planning, motion detection, and other fields, and have contributed greatly to the rapid development of these fields. At present, most CNNs in practical applications run on a central processing unit (CPU) or graphics processing unit (GPU) platform. However, real applications have specific requirements for performance, cost, and power consumption. CNNs are characterized by large data volumes, heavy computation, and no data feedback between successive layers. An emerging alternative is to implement the CNN on a dedicated convolutional neural network accelerator (CNNA), which offers low cost, high speed, and low power consumption and therefore has broad application prospects. By analyzing the characteristics of CNN algorithms, this thesis designs a customized extended instruction set for RISC-V (Reduced Instruction Set Computer, fifth generation) and a corresponding CNNA. The main research contents are as follows.

First, existing convolutional layer structures are analyzed and compared, including adder trees, systolic arrays, and the convolutional layer structure used in this thesis. Convolution operations are carried out under different parallel modes to evaluate the advantages and disadvantages of each mode and to determine the design targets of this scheme.

Then, the core structure of each module in the CNNA is designed.
(1) A processing element (PE) structure is designed that supports multichannel multiplexing and places no limit on the convolution kernel size. As long as the time to fetch data from memory is less than the time the convolution kernel takes to process a data block, the utilization of the multiply-accumulate (MAC) units in the PE is 100%.
(2) A direct memory access (DMA) unit dedicated to the CNNA is designed. It separates the channels for different types of data and can apply the data reshaping required by CNN operations to the data it transfers. Data in memory can be moved and reshaped between different blocks of static random-access memory (SRAM), and the external processor is supported in reading the data in memory.
(3) A pooling unit is designed. Through a special data-read scheme and an enable-signal design, it reserves room for supporting changes to the stride parameter in the future.

Finally, a set of RISC-V instructions dedicated to CNN computation is designed. The instructions are tailored to the computational characteristics and common operations of CNNs; they reduce the number of instructions read and the time spent fetching data between instructions, and also improve computational efficiency.

The performance of the CNNA is tested jointly through simulation and on a field-programmable gate array (FPGA). Although the CNNA is slightly behind VeriSilicon's CNNA product VIP9000 on some performance indicators, its area is smaller.
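
The utilization condition stated in item (1) can be illustrated with a small back-of-the-envelope model. This is only a sketch, not the thesis's design: it assumes the fetch of the next data block fully overlaps with computation on the current block (double buffering), and the function names, parameters, and cycle counts are all hypothetical.

```python
def compute_cycles_for_block(kernel_h: int, kernel_w: int, in_channels: int,
                             out_pixels: int, macs_per_cycle: int) -> int:
    """Cycles needed to process one data block (hypothetical cost model).

    Assumes every output pixel needs kernel_h * kernel_w * in_channels
    multiply-accumulates and that the PE array retires macs_per_cycle
    of them per cycle.
    """
    total_macs = kernel_h * kernel_w * in_channels * out_pixels
    # Ceiling division: a partially filled last cycle still costs a cycle.
    return -(-total_macs // macs_per_cycle)


def mac_utilization(fetch_cycles: int, compute_cycles: int) -> float:
    """Steady-state MAC utilization under the overlap assumption.

    Each block occupies max(fetch, compute) cycles of wall time, and the
    MACs are busy for compute_cycles of them.
    """
    if fetch_cycles < 0 or compute_cycles <= 0:
        raise ValueError("fetch_cycles must be >= 0 and compute_cycles > 0")
    return compute_cycles / max(fetch_cycles, compute_cycles)


if __name__ == "__main__":
    # Illustrative numbers only: a 3x3 kernel, 16 input channels,
    # 64 output pixels per block, 32 MACs per cycle.
    compute = compute_cycles_for_block(3, 3, 16, 64, 32)
    print(compute)                        # cycles to process one block
    print(mac_utilization(200, compute))  # fetch faster than compute
    print(mac_utilization(576, compute))  # fetch slower than compute
```

In this model the MACs stay fully busy whenever the fetch time does not exceed the compute time, and utilization falls in proportion once memory becomes the bottleneck.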
Keywords/Search Tags: CNN, Parallel Computing, Extended Instructions, DMA, RISC-V