Design And Implementation Of Convolutional Neural Network Processor

Posted on: 2018-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: Q Yan
Full Text: PDF
GTID: 2348330533465866
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Convolutional neural networks (CNNs) are widely used in image recognition, speech recognition, natural language processing, and other applications. CNN workloads are both data-intensive and computation-intensive. A traditional CPU platform cannot fully exploit the parallelism of a CNN, so it runs slowly and has a higher implementation cost. A dedicated CNN chip has advantages in speed and cost but is poorly configurable, so it cannot adapt flexibly to CNNs with different numbers of layers. By analyzing the characteristics and problems of the CNN algorithm, this thesis designs a new convolutional neural network processor that balances CNN parallel computing capability with flexibility. It is built on the general-purpose ZION processor by adding dedicated instructions and improving the architecture. The main research contents are as follows:

1. Dedicated instruction design. First, an analysis of the operation types in the CNN algorithm shows that convolution, subsampling, and the activation function occur most frequently. Dedicated instructions are therefore designed to implement these operations, so that a single dedicated instruction completes work that previously required several instructions. Second, vector memory access instructions are designed to read and write multiple data items at once, which reduces the number of memory access instructions and improves memory access efficiency. Finally, the dedicated instruction set is defined according to the RISC-V32 ISA and its extension rules.

2. Processor architecture design. Based on the ZION processor, which has a seven-stage pipeline, support for the CNN instruction set is added by introducing pipelined function units.
To exploit the data reuse that arises when the same convolution template is applied at different positions of the input feature map, a reuse architecture is designed that reduces the number of times feature map data is read, which lowers the memory access demand. In addition, to reduce the impact of memory latency on parallel computing, a dual buffer caches the data of different feature maps at different times, which reduces idle time and improves parallel efficiency.

Based on the instruction and architecture design, the pipelined function units were implemented in Verilog HDL, and a convolutional neural network processor with a seven-stage pipeline was implemented and successfully simulated. The CNN processor supports general-purpose algorithms and also significantly accelerates the CNN algorithm. The processor was tested on a CNN using the MNIST handwritten digit database as the sample set. Compared with the original ZION processor, the speed increases by a factor of 6.955 and the speed-to-area ratio by a factor of 3.398.
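
The data reuse idea described in the abstract can be illustrated with a small software model. The sketch below is not the thesis hardware: it compares the feature-map reads needed for one output row when every window is fetched in full against a scheme that keeps the overlapping columns and fetches one new column per step. The function names, the 5x5 template size, and the test data are assumptions chosen for illustration. Only the 28x28 input size is taken from MNIST.

```python
def conv_row_naive(fmap, kernel, row):
    """Compute one output row, re-reading the full K x K window at every position."""
    k = len(kernel)
    width = len(fmap[0])
    reads = 0
    out = []
    for col in range(width - k + 1):
        acc = 0
        for i in range(k):
            for j in range(k):
                acc += fmap[row + i][col + j] * kernel[i][j]
                reads += 1  # one feature-map read per multiply
        out.append(acc)
    return out, reads


def conv_row_reuse(fmap, kernel, row):
    """Compute the same row, keeping K-1 columns and reading one new column per step."""
    k = len(kernel)
    width = len(fmap[0])
    reads = 0
    window = []  # the last K columns read, each a list of K values
    out = []
    for col in range(width):
        column = [fmap[row + i][col] for i in range(k)]
        reads += k  # only the new column is fetched from the feature map
        window.append(column)
        if len(window) > k:
            window.pop(0)  # drop the column that left the window
        if len(window) == k:
            out.append(sum(window[j][i] * kernel[i][j]
                           for i in range(k) for j in range(k)))
    return out, reads


if __name__ == "__main__":
    # Hypothetical data: a 28x28 map (MNIST size) and an assumed 5x5 template.
    fmap = [[(r * 28 + c) % 7 for c in range(28)] for r in range(28)]
    kernel = [[(i + j) % 3 for j in range(5)] for i in range(5)]
    naive_out, naive_reads = conv_row_naive(fmap, kernel, 0)
    reuse_out, reuse_reads = conv_row_reuse(fmap, kernel, 0)
    print(naive_out == reuse_out, naive_reads, reuse_reads)
```

Under these assumptions both functions produce the same 24 outputs for a row, while the read count drops from 600 to 140. The model covers reuse along one row only and says nothing about how the thesis organizes its buffers.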
Keywords/Search Tags: CNN, Self-Defined Instructions, Microarchitecture, CPU