Design And Implementation Of Convolutional Neural Network Processor

Posted on:2018-01-25

Degree:Master

Type:Thesis

Country:China

Candidate:Q Yan

Full Text:PDF

GTID:2348330533465866

Subject:Microelectronics and Solid State Electronics

Abstract/Summary:

State-of-the-art convolutional neural networks (CNNs) are used in many applications, including image recognition, voice recognition, and natural language processing. CNNs are both data-intensive and computation-intensive. A traditional CPU platform cannot fully exploit the parallelism of a CNN, so it takes a long time to run and has a higher implementation cost. A dedicated CNN chip has advantages in speed and cost, but it is poorly configurable and cannot adapt flexibly to CNNs with different numbers of layers. By analyzing the characteristics and problems of the CNN algorithm, this thesis designs a new convolutional neural network processor that balances CNN parallel computing capability and flexibility. It builds on the general-purpose ZION processor by adding dedicated instructions and improving the architecture. The main research contents are as follows:

1. Dedicated instruction design. First, analysis and statistics of the operation types in the CNN algorithm show that convolution, subsampling, and the activation function occur frequently. Dedicated instructions are therefore designed for these operations, so that a single dedicated instruction completes a function that previously required several instructions. Second, vector memory access instructions are designed to read and write multiple data items at a time, which reduces the number of memory access instructions and improves memory access efficiency. Finally, the dedicated instruction system is designed according to the RISC-V32 ISA and its extension rules.

2. Processor architecture design. Based on the ZION processor, which has a seven-stage pipeline, pipelined function units are added to support the CNN instruction system. Because the same convolution template is applied at different positions of the input feature map, a data reuse architecture is designed to reduce the number of times feature map data is read, which lowers the memory access demand. In addition, to reduce the impact of memory latency on parallel computing, a dual buffer caches the data of different feature maps at different times, reducing idle time and improving parallel efficiency.

Based on the instruction and architecture design, the pipelined function units were implemented in Verilog HDL, and a CNN processor with a seven-stage pipeline was implemented and simulated successfully. The processor supports general algorithms and also significantly accelerates the CNN algorithm. The CNN processor was tested on the CNN algorithm using the MNIST handwritten digit database as the sample set. Compared with the original ZION processor, the speed is increased by 6.955 and the speed-to-area ratio by 3.398.
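The data reuse idea in the abstract can be illustrated with a small software model. The sketch below is not the thesis's hardware design, which the abstract does not detail; it assumes one common scheme, a K x K window register that shifts by one column, and simply counts feature-map reads. The function names `conv2d_naive` and `conv2d_reuse` are hypothetical.

```python
def conv2d_naive(fmap, kernel):
    """Valid 2-D convolution (correlation form) that re-reads every
    feature-map value for every output position."""
    h, w, k = len(fmap), len(fmap[0]), len(kernel)
    reads = 0
    out = []
    for r in range(h - k + 1):
        row = []
        for c in range(w - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += fmap[r + i][c + j] * kernel[i][j]
                    reads += 1  # one feature-map read per multiply
            row.append(acc)
        out.append(row)
    return out, reads


def conv2d_reuse(fmap, kernel):
    """Same result, but the K x K window is held locally (an assumed
    window register). Sliding right by one column fetches only K new values."""
    h, w, k = len(fmap), len(fmap[0]), len(kernel)
    reads = 0
    out = []
    for r in range(h - k + 1):
        # Fill the window for the first position in this output row.
        window = [[fmap[r + i][j] for j in range(k)] for i in range(k)]
        reads += k * k
        row = []
        for c in range(w - k + 1):
            if c > 0:
                # Drop the oldest column and fetch one new column.
                for i in range(k):
                    window[i] = window[i][1:] + [fmap[r + i][c + k - 1]]
                reads += k
            acc = sum(window[i][j] * kernel[i][j]
                      for i in range(k) for j in range(k))
            row.append(acc)
        out.append(row)
    return out, reads


# Small demonstration on a 5 x 5 ramp image with a 3 x 3 kernel.
fmap = [[5 * r + c for c in range(5)] for r in range(5)]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
naive_out, naive_reads = conv2d_naive(fmap, kernel)
reuse_out, reuse_reads = conv2d_reuse(fmap, kernel)
print(naive_reads, reuse_reads)
```

For this 5 x 5 input and 3 x 3 kernel, the naive loop issues 81 feature-map reads and the window-reuse version issues 45, with identical outputs. Reuse across output rows (for example with line buffers) would cut reads further, but is left out to keep the sketch short.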

Keywords/Search Tags:

CNN, Self-Defined instructions, Microarchitecture, CPU

Related items

1	Physically-aware synthesis and microarchitecture design
2	The Study Of GPGPU Microarchitecture And Performance Analysis
3	Design And Implementation Of A Microprocessor With CNN Extension Instructions
4	Characterizing, modeling and mitigating microarchitecture vulnerability and variability in light of small-scale processing technology
5	M-ary Spectrum Spreading Communication System Based On CPU Software Defined Radio Platform
6	Microarchitecture-aware physical planning for deep submicron technology
7	Buffer-oriented microarchitecture verification
8	CORDIC instructions for software defined radio
9	Research Of Voice Control Instructions Recognition Arithmetic In Vehicle
10	The microarchitecture of FPGA-based soft processors