
Study Of Sparse Neural Networks And Sparse Neural Network Accelerators

Posted on: 2020-08-06  Degree: Doctor  Type: Dissertation
Country: China  Candidate: X D Zhou  Full Text: PDF
GTID: 1368330572969072  Subject: Computer system architecture
Abstract/Summary:
Neural networks have rapidly become the dominant algorithms, achieving state-of-the-art performance in a broad range of applications such as image recognition, object detection, speech recognition, and natural language processing. However, neural networks keep moving toward deeper and larger architectures, and the resulting volume of data and computation poses a great challenge. Although sparsity has emerged as an effective way to directly reduce the intensity of computation and memory accesses, the irregularity caused by sparsity (in both sparse synapses and sparse neurons) prevents processing platforms, including CPUs, GPUs, and accelerators, from fully exploiting its benefits.

In this dissertation, we propose a cooperative software/hardware approach to handle the irregularity of sparse neural networks efficiently.

First, based on a wide range of experiments, we observe local convergence: larger weights tend to gather into small clusters during training rather than being randomly distributed. Based on this key observation, we propose a software-based coarse-grained pruning technique that drastically reduces the irregularity of sparse synapses. Instead of pruning synapses independently, coarse-grained pruning removes several synapses together: the synapses are first divided into blocks, and a block of synapses is permanently removed from the network topology if it meets a specific criterion. We then fine-tune the network to retain its accuracy. Note that we apply coarse-grained pruning iteratively during training to achieve better sparsity while avoiding accuracy loss. Coarse-grained pruning reduces irregularity by 20.13× on average. We then introduce a novel compression algorithm, a three-stage pipeline of coarse-grained pruning, local quantization, and entropy encoding, which together reduce the storage requirements of AlexNet and VGG16 by 79× and 98×, respectively. These compression ratios are much higher than those achieved by two existing state-of-the-art neural network compression methods, Deep Compression (35× and 49×) and CNNPack (39× and 46×).

We further design a hardware accelerator named Cambricon-S to efficiently handle the remaining irregularity of sparse synapses and neurons. The accelerator features a central neuron selector module (NSM) to leverage coarse-grained sparsity. Additionally, a synapse selector module (SSM), an encoder, and a weight decoding module (WDM) are used to leverage neuron sparsity, dynamically compress neurons, and exploit local quantization, respectively. Compared with Cambricon-X, a state-of-the-art sparse neural network accelerator, our accelerator is 1.71× better in performance and 1.75× better in energy efficiency. To ease the burden on programmers, we also propose a highly efficient, library-based programming environment for our accelerator, whose compiler applies loop tiling and data-reuse strategies to generate highly efficient instructions.
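The coarse-grained pruning step described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the block shape and the mean-absolute-magnitude criterion are assumptions for the example, since the abstract does not specify the exact pruning criterion.

```python
import numpy as np

def coarse_grained_prune(weights, block_shape, threshold):
    """Prune whole blocks of synapses together.

    A block is zeroed out when the mean absolute weight inside it
    falls below `threshold` -- one plausible criterion; the
    dissertation's exact criterion may differ.
    Returns the pruned weights and the binary block mask.
    """
    rows, cols = weights.shape
    br, bc = block_shape
    mask = np.ones_like(weights)
    for i in range(0, rows, br):
        for j in range(0, cols, bc):
            block = weights[i:i + br, j:j + bc]
            if np.abs(block).mean() < threshold:
                # Remove the entire block of synapses at once,
                # keeping the non-zero pattern block-regular.
                mask[i:i + br, j:j + bc] = 0.0
    return weights * mask, mask

# Example: one strong 2x2 block survives, one weak block is pruned.
W = np.array([[1.0, 1.0, 0.01, 0.01],
              [1.0, 1.0, 0.01, 0.01]])
pruned, mask = coarse_grained_prune(W, block_shape=(2, 2), threshold=0.5)
```

In the full method this pruning would be interleaved with fine-tuning and applied iteratively during training; the sketch covers only the block-removal step.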
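The loop tiling mentioned for the compiler can be illustrated with a generic tiled matrix multiply. This is a textbook sketch of the technique, not the accelerator's code generation: the tile size, loop order, and use of plain Python lists are illustrative assumptions.

```python
def tiled_matmul(A, B, tile=32):
    """Loop-tiled matrix multiply: C = A @ B.

    Iterating over tiles keeps small sub-blocks of A, B, and C hot
    in fast local storage, reusing each loaded element many times --
    the data-reuse strategy that tiling enables.
    """
    n, k = len(A), len(A[0])
    m = len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for ii in range(0, n, tile):          # tile over rows of C
        for jj in range(0, m, tile):      # tile over columns of C
            for kk in range(0, k, tile):  # tile over the reduction dim
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, m)):
                        s = C[i][j]
                        for p in range(kk, min(kk + tile, k)):
                            s += A[i][p] * B[p][j]
                        C[i][j] = s
    return C
```

On an accelerator the same idea maps tiles onto on-chip buffers so that each fetched operand is reused across an entire tile rather than re-read from memory.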
Keywords/Search Tags:neural networks, sparsity, compression, accelerator