
Bit-width And Sparsity Adaptive Accelerator Research And Design For Convolution Neural Network

Posted on: 2019-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: J X Guo
Full Text: PDF
GTID: 2428330590951658
Subject: Integrated circuit engineering
Abstract/Summary:
Convolutional neural networks (CNNs) have achieved great success in many computer vision applications, but their high computational complexity hinders further performance improvement. Recently, various FPGA-based accelerators have been proposed to improve CNN performance. Prior research has shown that each CNN layer requires a different short bit-width, and sparse CNN techniques have shown that pruning weights yields a network with a substantial number of zero values, which can reduce an accelerator's computational requirements. However, most state-of-the-art FPGA-based accelerators use the same bit-width for all CNN layers and do not exploit the zero values of sparse CNNs. They typically implement a single convolutional processor (CP), which iteratively computes the CNN layers one at a time with a fixed bit-width. This leads to very low resource utilization and makes further performance improvement difficult. To address this problem, this thesis proposes a short-bit-width and sparsity-adaptive accelerator design for CNNs that simultaneously adapts to short-bit-width and sparse CNNs with varying per-layer bit-width requirements. First, we optimize the DSP operations to achieve higher DSP utilization when the network requires short bit-widths, and construct multiple CPs with different bit-widths to process the CNN layers in parallel. Then, building on the DSP optimization, we propose a sparse CNN accelerator design that jointly exploits short-bit-width weights at the granularity of convolution kernels. Finally, we combine the short-bit-width and sparse CNN optimizations to obtain the optimal kernel combination and hardware structure. As a result, our approach achieves 5.68x to 6.67x higher throughput (6.17x on average) than state-of-the-art FPGA-based CNN accelerators.
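The DSP optimization for short bit-widths mentioned above commonly packs two narrow multiplications into one wide DSP multiplier. The thesis does not give its exact scheme; the following is a minimal sketch of the general idea, assuming unsigned operands and a shift chosen so the low-order product cannot carry into the high-order one (real signed designs on FPGA DSP slices need additional correction terms). All names here are illustrative, not taken from the thesis.

```python
def packed_multiply(a, w1, w2, shift=16):
    """Compute w1*a and w2*a with a single wide multiplication,
    emulating two short-bit-width products sharing one DSP multiplier.

    Assumes a, w1, w2 are unsigned and w2 * a < 2**shift, so the two
    partial products occupy disjoint bit fields of the wide result."""
    packed = (w1 << shift) | w2      # place both weights in one operand
    product = packed * a             # one multiplication yields both products
    hi = product >> shift            # upper field: w1 * a
    lo = product & ((1 << shift) - 1)  # lower field: w2 * a
    return hi, lo

# Example: one 8-bit activation against two 8-bit weights
print(packed_multiply(5, 3, 7))  # (15, 35)
```

With 8-bit operands, each product fits in 16 bits, so a single 27x18 DSP multiplier has room for both fields; this is how per-layer short bit-widths translate into higher DSP utilization.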
Keywords/Search Tags: convolutional neural network, accelerator, FPGA, short bit-width CNN, sparse CNN