
Efficient Hardware And Software Co-Design For Deep Neural Networks

Posted on: 2020-04-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Q Wang
Full Text: PDF
GTID: 1368330626964467
Subject: Computer Science and Technology
Abstract/Summary:
Deep neural networks have driven revolutionary progress in the field of artificial intelligence and have changed our way of life. These networks, which achieve remarkable results across a wide range of applications, place enormous demands on computing and storage resources. This limits where neural networks can be deployed, especially on resource-constrained platforms such as mobile terminals and embedded systems. Because the pace of Moore's law is gradually slowing while algorithms continue to evolve rapidly, it is difficult to address these problems effectively from the hardware side or the software side alone. In this dissertation, we adopt a hardware-software co-design methodology and conduct in-depth research along three directions: hardware adapting to software, hardware-software coupled optimization, and software adapting to hardware. The main achievements and innovations include:

· In the aspect of hardware adapting to software, we propose a data-centric convolutional neural network accelerator. By exploiting the data reuse patterns in convolutional neural networks, we design a novel dataflow that reduces the amount of on-chip data transfer. Optimizing convolution, the dominant operation in these networks, yields efficient computational performance, and the architecture also achieves low power consumption and small area overhead. (A loop-nest sketch of this kind of data reuse appears after the abstract.)

· For hardware-software coupled optimization, we propose SNrram, an efficient sparse neural network computing architecture based on resistive random-access memory. At the software level, SNrram adopts a hardware-friendly network pruning method to generate structured sparse computation patterns; at the hardware level, it decomposes and reorganizes the structured sparse matrices to exploit the sparsity more efficiently. We also design a Sparsity Transfer Algorithm to normalize sparse data in both weights and activations. Experiments show that SNrram achieves efficient performance on sparse neural networks. (A structured-pruning sketch appears after the abstract.)

· In terms of software adapting to hardware, we propose two quantization methods for different kinds of neural networks. To address the accuracy decline of quantized recurrent neural network models, we propose HitNet, a hybrid ternary recurrent neural network. HitNet applies different quantization strategies to weights and activations according to their distributions, and successfully quantizes all weights and part of the activations to the ternary values {-1, 0, 1}. HitNet further closes the accuracy gap between the quantized model and the original model, and significantly outperforms the state-of-the-art methods for extremely quantized models. (A ternarization sketch appears after the abstract.)

· To cope with the instability in the training process of generative adversarial networks, we propose a quantization method called QGAN. Observing that quantization methods developed for other networks fall short on GANs, we develop a novel method for GANs based on the EM algorithm together with a multi-precision quantization strategy. QGAN successfully quantizes GANs down to 1-bit or 2-bit representations while keeping the result quality comparable to the original models. (An EM-style codebook-quantization sketch appears after the abstract.)
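The abstract does not spell out the proposed dataflow, so the following is only a minimal sketch, assuming a plain NumPy loop nest, of where data reuse arises in convolution; the function name and shapes are illustrative, not taken from the dissertation.

```python
import numpy as np

def conv2d_loop_nest(x, w):
    """Plain convolution loop nest, annotated to show where data reuse
    arises. x: (C_in, H, W) input; w: (C_out, C_in, K, K) weights."""
    C_in, H, W = x.shape
    C_out, _, K, _ = w.shape
    H_out, W_out = H - K + 1, W - K + 1
    y = np.zeros((C_out, H_out, W_out))
    for co in range(C_out):        # each filter w[co] is reused at
        for i in range(H_out):     #   every output position (i, j)
            for j in range(W_out):
                # each input patch x[:, i:i+K, j:j+K] is reused by all
                # C_out filters and overlaps with neighboring patches;
                # a data-centric accelerator keeps such reused operands
                # on-chip instead of refetching them from DRAM
                y[co, i, j] = np.sum(x[:, i:i+K, j:j+K] * w[co])
    return y
```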
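Likewise, SNrram's actual pruning method is not described here; the sketch below shows one common hardware-friendly structured scheme (zeroing whole columns of a weight matrix by L2 norm), with `keep_ratio` and the column granularity being assumptions of ours.

```python
import numpy as np

def prune_columns(W, keep_ratio=0.5):
    """Structured-pruning sketch: zero out entire columns of a 2-D
    weight matrix by L2 norm, so the surviving non-zeros form a regular
    pattern that sparse hardware can index cheaply."""
    norms = np.linalg.norm(W, axis=0)         # one score per column
    k = max(1, int(keep_ratio * W.shape[1]))  # number of columns to keep
    keep = np.argsort(norms)[-k:]             # strongest columns survive
    mask = np.zeros(W.shape[1], dtype=bool)
    mask[keep] = True
    return W * mask                           # zeroed columns = structured sparsity
```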
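HitNet's distribution-dependent strategies are also not detailed in the abstract; this sketch shows only the basic threshold ternarization to {-1, 0, 1} with a scale factor, using the common 0.7 × mean(|w|) heuristic as an assumed threshold.

```python
import numpy as np

def ternarize(w):
    """Generic threshold ternarization to {-1, 0, +1} plus a scale, so
    that w is approximated by alpha * t. (The per-tensor strategy in
    HitNet depends on the value distribution; this is the basic step.)"""
    delta = 0.7 * np.abs(w).mean()                  # assumed heuristic threshold
    t = np.where(np.abs(w) > delta, np.sign(w), 0.0)
    alpha = np.abs(w[t != 0]).mean() if np.any(t) else 0.0
    return t, alpha
```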
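Finally, as a rough, assumed illustration of EM-based quantization (QGAN's exact formulation is not given in this abstract), hard EM reduces to a k-means-style alternation between assigning weights to codebook centroids and refitting the centroids.

```python
import numpy as np

def em_codebook_quantize(w, bits=2, iters=20):
    """Hard-EM (k-means style) codebook quantization sketch.
    w: 1-D array of weights; returns quantized weights drawn from a
    learned 2**bits-entry codebook."""
    levels = 2 ** bits
    c = np.linspace(w.min(), w.max(), levels)   # initial centroids
    for _ in range(iters):
        # E-step: assign each weight to its nearest centroid
        assign = np.argmin(np.abs(w[:, None] - c[None, :]), axis=1)
        # M-step: refit each centroid to the weights assigned to it
        for k in range(levels):
            if np.any(assign == k):
                c[k] = w[assign == k].mean()
    return c[assign]
```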
Keywords/Search Tags: Deep neural network, hardware and software co-design, computer architecture, sparsity, quantization