
An FPGA-based Accelerator For Sparse Neural Networks

Posted on: 2019-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Lu
Full Text: PDF
GTID: 2428330542994226
Subject: Computer system architecture
Abstract/Summary:
With growing attention in recent years, large-scale neural networks achieve strong performance across many domains. Large scale means large numbers of neurons and synapses, which makes these networks both computation- and memory-intensive and therefore difficult to deploy and execute on resource-limited devices. Sparse neural networks address this difficulty by removing redundant neurons and synapse weights. Nevertheless, conventional accelerators cannot benefit from this sparsity: their speedup does not match the extent of the parameter reduction, so new accelerators designed for sparse neural networks are required. Field Programmable Gate Arrays (FPGAs) are a useful platform for implementing such specialized accelerators, offering high flexibility, low cost, and short development cycles.

This thesis uses two pruning techniques to generate different types of sparse neural networks, analyzes the computation and parameter structures of their different layers, and implements two FPGA-based accelerators. The contributions are as follows:

1. Considering the sparsity of the parameters produced by pruning, this thesis analyzes the computations of the prediction stage, chooses suitable compression formats for the parameters, and designs corresponding computation cores. Based on the available FPGA resources, it determines the number of computation units and implements an accelerator for sparse neural networks.

2. Because the first FPGA accelerator achieves only a low speedup in convolutional layers, this thesis applies different pruning techniques to the convolutional and fully-connected layers. After analyzing the computation and parameter structures of the resulting sparse networks, it selects a suitable compression format, designs computation cores, determines the number of computation units according to the FPGA resources, and implements an improved accelerator.

3. The performance of the two sparse neural network accelerators is verified. Across different weight sparsities, this thesis compares compression efficiency and the performance of the computation cores for matrix-matrix and matrix-vector multiplications. It also runs different sparse neural network models and compares the two accelerators against dense neural network accelerators. Experimental results demonstrate that both accelerators compress sparse weights and accelerate computation; between the two, the improved accelerator makes better use of weight sparsity and computation-unit capability, achieving the better performance.
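The abstract does not name the specific pruning technique or compression format used; as an illustrative sketch only, the following assumes magnitude-based pruning (a common choice, where small weights are zeroed) and the CSR (compressed sparse row) format, one plausible candidate for storing the pruned weights compactly:

```python
import numpy as np

def magnitude_prune(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold (assumed pruning rule)."""
    pruned = weights.copy()
    pruned[np.abs(pruned) < threshold] = 0.0
    return pruned

def to_csr(matrix):
    """Compress a sparse matrix into CSR arrays: non-zero values,
    their column indices, and per-row start/end pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # index where the next row's values begin
    return np.array(values), np.array(col_idx), np.array(row_ptr)

# Hypothetical 2x3 weight matrix of a fully-connected layer
W = np.array([[0.9, 0.01, 0.0],
              [0.02, 0.0, -0.8]])
Wp = magnitude_prune(W, threshold=0.1)   # keeps only 0.9 and -0.8
vals, cols, ptrs = to_csr(Wp)
# vals → [0.9, -0.8], cols → [0, 2], ptrs → [0, 1, 2]
```

Only the non-zero values and their positions are stored, which is why the achievable compression grows with the weight sparsity the thesis varies in its experiments.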
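The matrix-vector multiplication evaluated for the computation cores can likewise be sketched in software. Assuming the weights are held in CSR form (an assumption, as above), each output element accumulates only the stored non-zero products, which is the work a sparse computation core actually performs in the prediction stage:

```python
import numpy as np

def csr_spmv(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product y = A @ x, with A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):
        # Iterate only over row i's stored non-zeros; zeros from pruning
        # contribute no multiply-accumulate operations at all.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[2, 0, 0],
#      [0, 0, 3]]  stored in CSR
values  = np.array([2.0, 3.0])
col_idx = np.array([0, 2])
row_ptr = np.array([0, 1, 2])
x = np.array([1.0, 5.0, 4.0])
y = csr_spmv(values, col_idx, row_ptr, x)   # → [2.0, 12.0]
```

The inner loop's trip count depends on each row's non-zero count rather than the full matrix width, which is why an accelerator that exploits this structure can see a speedup closer to the parameter reduction than a dense design does.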
Keywords/Search Tags:Neural Network Pruning, Sparse Neural Networks, Sparse Weight Compression, FPGA Accelerators