Research On Key Technologies Of Reconfigurable Neural Network Accelerator Design

Posted on: 2018-11-29    Degree: Doctor    Type: Dissertation
Country: China    Candidate: S Liang    Full Text: PDF
GTID: 1368330596452851    Subject: Electronic Science and Technology
Abstract/Summary:
With the rapid growth of computing capability and data scale, machine learning algorithms based on deep neural networks (NNs) have shown excellent performance in areas such as vision and speech. The complexity of NN models keeps increasing, which places higher demands on processors. General-purpose von Neumann processors cannot offer satisfying performance, while reconfigurable processors provide reconfigurability, flexibility and high parallelism, which makes them a feasible choice. This thesis targets the acceleration of the inference process of convolutional NNs (CNNs) and covers both statically reconfigurable FPGAs and dynamically reconfigurable CGRAs. It applies optimizations in data bitwidth, hardware granularity, datapath reuse and the hierarchical memory system to produce customized designs for each platform.

For the static reconfigurable design, we first use an FPGA as the platform to build FP-CNN, a high-efficiency acceleration system for common CNN models. Based on resource-aware modeling, we give a quantitative analysis of computation and storage volume, memory access bandwidth and energy cost. We quantize weights and feature maps (fmaps), and then realize a doubled datapath and on-chip storage design for weights and fmaps, which greatly broadens the computing bandwidth. Secondly, we push quantization to its extreme, binarization, which represents data with 1 bit, and design FP-BNN, an acceleration architecture for the binarized NN (BNN) model. We propose a customized datapath for bit-level vector multiplication based on XNOR and Hamming-weight computations, which avoids the use of multipliers, and achieve ultra-high throughput with a dedicated memory design and system scheduling.

For the dynamic reconfigurable design, we architect around the characteristics of NN tasks and implement the coarse-grained dynamically reconfigurable processor chip Chameleon. We build the system around the chip with careful design of the system architecture, configuration structure and compilation flow, which leads to a scalable, high-efficiency acceleration system.

The evaluations show that the optimized datapath and memory bandwidth give FP-CNN speed and efficiency exceeding existing FPGA designs, and FP-BNN reaches tera-OP/s throughput, which is even better than state-of-the-art GPUs. DP-CNN offers a dynamic acceleration solution with high efficiency, low power and run-time configurability, which earns it a place in areas such as mobile embedded systems.
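The following is a minimal Python sketch (not taken from the thesis) of the two data-level ideas the abstract mentions: fixed-point quantization of weights and feature maps, and the XNOR plus Hamming-weight (popcount) dot product that binarized networks use to replace multipliers. Function names and bit widths are illustrative assumptions, not the FP-CNN/FP-BNN hardware implementation.

    import numpy as np


    def quantize_fixed_point(x, frac_bits=8, total_bits=16):
        """Quantize a float array to signed fixed-point with `frac_bits` fraction bits."""
        scale = 1 << frac_bits
        qmin, qmax = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
        q = np.clip(np.round(x * scale), qmin, qmax).astype(np.int32)
        return q, scale  # integer codes plus the scale needed to recover real values


    def binarize(x):
        """Map real values to {+1, -1} and store only the sign bits (1 = +1, 0 = -1)."""
        return (x >= 0).astype(np.uint8)


    def xnor_popcount_dot(a_bits, b_bits):
        """Dot product of two +/-1 vectors stored as sign bits.

        With +1 -> 1 and -1 -> 0, matching signs contribute +1 and mismatches -1,
        so dot = 2 * popcount(XNOR(a, b)) - N, and no multiplier is needed.
        """
        n = a_bits.size
        agree = ~(a_bits ^ b_bits) & 1          # 1 where the signs agree (XNOR)
        return 2 * int(agree.sum()) - n         # Hamming weight turned into a dot product


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        a = rng.standard_normal(64)
        b = rng.standard_normal(64)
        ref = int(np.sign(a) @ np.sign(b))                   # conventional multiply-add
        fast = xnor_popcount_dot(binarize(a), binarize(b))   # XNOR + popcount version
        assert ref == fast

In hardware, the XNOR and popcount steps map onto simple logic gates and adder trees, which is why this formulation removes the multiplier arrays that dominate area and power in a conventional CNN datapath.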
Keywords/Search Tags:Reconfigurable Computing, Convolutional Neural Network, Data Quantization, Accelerator Design, Customized Computing