
Efficient Hardware And Software Co-Design For Deep Neural Networks

Posted on: 2020-04-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Q Wang
Full Text: PDF
GTID: 1368330626964467
Subject: Computer Science and Technology
Abstract/Summary:
Deep neural networks have driven revolutionary progress in the field of artificial intelligence and have changed our way of life. These networks, which achieve remarkable results across a wide range of applications, place enormous demands on computing and storage resources. This limits where neural networks can be deployed, especially on resource-constrained platforms such as mobile terminals and embedded systems. Because the pace of Moore's law is gradually slowing while algorithms continue to evolve rapidly, it is difficult to address these problems effectively from the hardware side or the software side alone. In this dissertation, we adopt a hardware-software co-design methodology and conduct in-depth research along three directions: hardware adapting to software, hardware-software coupled optimization, and software adapting to hardware. The main achievements and innovations include:

· In the aspect of hardware adapting to software, we propose a data-centric convolutional neural network accelerator. By exploiting the data reuse patterns in convolutional neural networks, we design a novel dataflow that reduces the amount of on-chip data transfer. Optimizing convolution, the dominant operation in these networks, yields efficient computational performance, and the architecture also achieves low power consumption and small area overhead. (A loop-nest sketch of this kind of data reuse appears after the abstract.)

· For hardware-software coupled optimization, we propose SNrram, an efficient sparse neural network computing architecture based on resistive random-access memory. At the software level, SNrram adopts a hardware-friendly network pruning method to generate structured sparse computation patterns; at the hardware level, it decomposes and reorganizes the structured sparse matrices to exploit the sparsity more efficiently. We also design a Sparsity Transfer Algorithm to normalize sparse data in both weights and activations. Experiments show that SNrram achieves efficient performance on sparse neural networks. (A structured-pruning sketch appears after the abstract.)

· In terms of software adapting to hardware, we propose two quantization methods for different kinds of neural networks. To address the accuracy decline of quantized recurrent neural network models, we propose HitNet, a hybrid ternary recurrent neural network. HitNet applies different quantization strategies to weights and activations according to their distributions, and successfully quantizes all weights and part of the activations to the ternary values {-1, 0, 1}. HitNet further closes the accuracy gap between the quantized model and the original model, and significantly outperforms the state-of-the-art methods for extremely quantized models. (A ternarization sketch appears after the abstract.)

· To cope with the instability in the training process of generative adversarial networks, we propose a quantization method called QGAN. Observing that quantization methods developed for other networks fall short on GANs, we develop a novel method for GANs based on the EM algorithm together with a multi-precision quantization strategy. QGAN successfully quantizes GANs down to 1-bit or 2-bit representations while keeping the result quality comparable to the original models. (An EM-style codebook-quantization sketch appears after the abstract.)
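The abstract does not spell out the proposed dataflow, so the following is only a minimal sketch, assuming a plain NumPy loop nest, of where data reuse arises in convolution; the function name and shapes are illustrative, not taken from the dissertation.

```python
import numpy as np

def conv2d_loop_nest(x, w):
    """Plain convolution loop nest, annotated to show where data reuse
    arises. x: (C_in, H, W) input; w: (C_out, C_in, K, K) weights."""
    C_in, H, W = x.shape
    C_out, _, K, _ = w.shape
    H_out, W_out = H - K + 1, W - K + 1
    y = np.zeros((C_out, H_out, W_out))
    for co in range(C_out):        # each filter w[co] is reused at
        for i in range(H_out):     #   every output position (i, j)
            for j in range(W_out):
                # each input patch x[:, i:i+K, j:j+K] is reused by all
                # C_out filters and overlaps with neighboring patches;
                # a data-centric accelerator keeps such reused operands
                # on-chip instead of refetching them from DRAM
                y[co, i, j] = np.sum(x[:, i:i+K, j:j+K] * w[co])
    return y
```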
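Likewise, SNrram's actual pruning method is not described here; the sketch below shows one common hardware-friendly structured scheme (zeroing whole columns of a weight matrix by L2 norm), with `keep_ratio` and the column granularity being assumptions of ours.

```python
import numpy as np

def prune_columns(W, keep_ratio=0.5):
    """Structured-pruning sketch: zero out entire columns of a 2-D
    weight matrix by L2 norm, so the surviving non-zeros form a regular
    pattern that sparse hardware can index cheaply."""
    norms = np.linalg.norm(W, axis=0)         # one score per column
    k = max(1, int(keep_ratio * W.shape[1]))  # number of columns to keep
    keep = np.argsort(norms)[-k:]             # strongest columns survive
    mask = np.zeros(W.shape[1], dtype=bool)
    mask[keep] = True
    return W * mask                           # zeroed columns = structured sparsity
```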
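HitNet's distribution-dependent strategies are also not detailed in the abstract; this sketch shows only the basic threshold ternarization to {-1, 0, 1} with a scale factor, using the common 0.7 × mean(|w|) heuristic as an assumed threshold.

```python
import numpy as np

def ternarize(w):
    """Generic threshold ternarization to {-1, 0, +1} plus a scale, so
    that w is approximated by alpha * t. (The per-tensor strategy in
    HitNet depends on the value distribution; this is the basic step.)"""
    delta = 0.7 * np.abs(w).mean()                  # assumed heuristic threshold
    t = np.where(np.abs(w) > delta, np.sign(w), 0.0)
    alpha = np.abs(w[t != 0]).mean() if np.any(t) else 0.0
    return t, alpha
```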
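Finally, as a rough, assumed illustration of EM-based quantization (QGAN's exact formulation is not given in this abstract), hard EM reduces to a k-means-style alternation between assigning weights to codebook centroids and refitting the centroids.

```python
import numpy as np

def em_codebook_quantize(w, bits=2, iters=20):
    """Hard-EM (k-means style) codebook quantization sketch.
    w: 1-D array of weights; returns quantized weights drawn from a
    learned 2**bits-entry codebook."""
    levels = 2 ** bits
    c = np.linspace(w.min(), w.max(), levels)   # initial centroids
    for _ in range(iters):
        # E-step: assign each weight to its nearest centroid
        assign = np.argmin(np.abs(w[:, None] - c[None, :]), axis=1)
        # M-step: refit each centroid to the weights assigned to it
        for k in range(levels):
            if np.any(assign == k):
                c[k] = w[assign == k].mean()
    return c[assign]
```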
Keywords/Search Tags: Deep neural network, hardware and software co-design, computer architecture, sparsity, quantization