
An FPGA-based Accelerator For Sparse Neural Networks

Posted on: 2019-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Lu
Full Text: PDF
GTID: 2428330542994226
Subject: Computer system architecture
Abstract/Summary:
With growing attention in recent years, large-scale neural networks achieve strong performance across many domains. Large scale means large numbers of neurons and synapses, which makes these networks both computation- and memory-intensive and therefore difficult to deploy and execute on resource-limited devices. Sparse neural networks address this difficulty by removing redundant neurons and synapse weights. Nevertheless, conventional accelerators cannot benefit from this sparsity: their speedup does not match the extent of the parameter reduction, so new accelerators designed for sparse neural networks are required. Field Programmable Gate Arrays (FPGAs) are a useful platform for implementing such specialized accelerators, offering high flexibility, low cost, and short development cycles.

This thesis uses two pruning techniques to generate different types of sparse neural networks, analyzes the computation and parameter structures of their different layers, and implements two FPGA-based accelerators. The contributions are as follows:

1. Considering the sparsity of the parameters produced by pruning, this thesis analyzes the computations of the prediction stage, chooses suitable compression formats for the parameters, and designs corresponding computation cores. Based on the available FPGA resources, it determines the number of computation units and implements an accelerator for sparse neural networks.

2. Because the first FPGA accelerator achieves only a low speedup in convolutional layers, this thesis applies different pruning techniques to the convolutional and fully-connected layers. After analyzing the computation and parameter structures of the resulting sparse networks, it selects a suitable compression format, designs computation cores, determines the number of computation units according to the FPGA resources, and implements an improved accelerator.

3. The performance of the two sparse neural network accelerators is verified. Across different weight sparsities, this thesis compares compression efficiency and the performance of the computation cores for matrix-matrix and matrix-vector multiplications. It also runs different sparse neural network models and compares the two accelerators against dense neural network accelerators. Experimental results demonstrate that both accelerators compress sparse weights and accelerate computation; between the two, the improved accelerator makes better use of weight sparsity and computation-unit capability, achieving the better performance.
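The abstract does not name the specific pruning technique or compression format used; as an illustrative sketch only, the following assumes magnitude-based pruning (a common choice, where small weights are zeroed) and the CSR (compressed sparse row) format, one plausible candidate for storing the pruned weights compactly:

```python
import numpy as np

def magnitude_prune(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold (assumed pruning rule)."""
    pruned = weights.copy()
    pruned[np.abs(pruned) < threshold] = 0.0
    return pruned

def to_csr(matrix):
    """Compress a sparse matrix into CSR arrays: non-zero values,
    their column indices, and per-row start/end pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # index where the next row's values begin
    return np.array(values), np.array(col_idx), np.array(row_ptr)

# Hypothetical 2x3 weight matrix of a fully-connected layer
W = np.array([[0.9, 0.01, 0.0],
              [0.02, 0.0, -0.8]])
Wp = magnitude_prune(W, threshold=0.1)   # keeps only 0.9 and -0.8
vals, cols, ptrs = to_csr(Wp)
# vals → [0.9, -0.8], cols → [0, 2], ptrs → [0, 1, 2]
```

Only the non-zero values and their positions are stored, which is why the achievable compression grows with the weight sparsity the thesis varies in its experiments.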
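The matrix-vector multiplication evaluated for the computation cores can likewise be sketched in software. Assuming the weights are held in CSR form (an assumption, as above), each output element accumulates only the stored non-zero products, which is the work a sparse computation core actually performs in the prediction stage:

```python
import numpy as np

def csr_spmv(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product y = A @ x, with A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):
        # Iterate only over row i's stored non-zeros; zeros from pruning
        # contribute no multiply-accumulate operations at all.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[2, 0, 0],
#      [0, 0, 3]]  stored in CSR
values  = np.array([2.0, 3.0])
col_idx = np.array([0, 2])
row_ptr = np.array([0, 1, 2])
x = np.array([1.0, 5.0, 4.0])
y = csr_spmv(values, col_idx, row_ptr, x)   # → [2.0, 12.0]
```

The inner loop's trip count depends on each row's non-zero count rather than the full matrix width, which is why an accelerator that exploits this structure can see a speedup closer to the parameter reduction than a dense design does.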
Keywords/Search Tags:Neural Network Pruning, Sparse Neural Networks, Sparse Weight Compression, FPGA Accelerators