
Optimal Designs Of Deep Neural Networks On TTA Based ASIP

Posted on: 2020-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Wu
Full Text: PDF
GTID: 2428330572476354
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, advances in chip technology have allowed neural networks to be designed deeper and larger so that they can cope with more complex and abstract tasks. The accompanying problem is that deep neural networks require large-scale floating-point operations during both training and inference, placing heavy demands on storage resources and power consumption. Developing Transport Triggered Architecture (TTA) based Application Specific Instruction Set Processors (ASIPs), which offer high performance and low power consumption while retaining the flexibility of an instruction set, is one approach to this problem. Quantization and acceleration are two common techniques in the hardware implementation of deep neural networks, and convolutional neural networks are the most representative and widely used deep neural networks. The principal work of this thesis is to optimize convolutional neural networks on a TTA-based ASIP, which comprises two main parts.

First, targeting the implementation requirements of TTA-based neural-network ASIPs, this thesis proposes an end-to-end 8-bit quantization scheme. Different quantization strategies are designed according to the characteristics of weights, activations, and gradients: symmetric affine quantization for weights, dynamic upper-limit quantization for activations, variable-precision quantization for gradients, and an approximate batch normalization algorithm (a sketch of the weight and activation quantizers follows this abstract). In experiments on multiple datasets and multiple network structures, the proposed quantization scheme achieves accuracy comparable to full-precision networks and outperforms several commonly used schemes.

Second, this thesis proposes a quantized convolution scheme based on the multiplication combining law, which integrates look-up table resources into the convolution operation, and designs a quantized convolution function unit for TTA. Kernel tiling, loop unrolling, and data-exchange schemes matched to the characteristics of quantized convolution are provided, and the overall structure of the TTA-based accelerator is described (an illustrative sketch of the combining-law idea also follows). Compared with the traditional convolution scheme, the proposed quantized convolution scheme alleviates the parallelism restriction caused by the limited number of multipliers and improves both parallelism and energy efficiency.

In this thesis, the quantization scheme is the basis of the acceleration scheme, and the acceleration scheme complements the quantization scheme. Together they constitute the optimization work of this thesis and provide support for the TTA-based ASIP implementation of deep neural networks.
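The abstract names the quantizers but not their exact formulas, so the following NumPy sketch is only one plausible reading: per-tensor symmetric affine quantization for weights, and clipping to a dynamic upper bound followed by unsigned 8-bit quantization for activations. The function names, the per-tensor scale, and the way the upper bound is supplied are illustrative assumptions, not the thesis's actual method.

    import numpy as np

    def quantize_weights_symmetric(w, num_bits=8):
        # Symmetric affine quantization: map weights in [-max|w|, +max|w|]
        # onto signed integers in [-127, 127] with one scale per tensor.
        qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
        scale = np.max(np.abs(w)) / qmax          # per-tensor scale (assumption)
        q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
        return q, scale                           # w is approximated by q * scale

    def quantize_activations_dynamic(x, upper, num_bits=8):
        # "Dynamic upper limit" quantization as read here: clip activations
        # to a dynamically tracked upper bound, then quantize the clipped
        # range [0, upper] to unsigned 8-bit codes (ReLU outputs assumed).
        qmax = 2 ** num_bits - 1                  # 255 for 8 bits
        scale = upper / qmax
        q = np.round(np.clip(x, 0.0, upper) / scale).astype(np.uint8)
        return q, scale

    # Example: the reconstruction error q * scale - w is bounded by scale / 2.
    w = np.random.randn(3, 3).astype(np.float32)
    q, s = quantize_weights_symmetric(w)
    print(np.max(np.abs(w - q.astype(np.float32) * s)))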
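The "multiplication combining law" is likewise described only at a high level. One natural reading, sketched below under that assumption, exploits the fact that 8-bit weights take at most 256 distinct values: within one output accumulation, activations multiplied by the same weight value are first summed into a table indexed by the weight code (a * x1 + a * x2 = a * (x1 + x2)), so at most 255 multiplications remain per output regardless of kernel size, and the rest of the work is additions. The binning scheme and data layout here are illustrative, not the thesis's function-unit design.

    import numpy as np

    def conv_output_combining_law(q_act, q_wt):
        # One output value of a quantized convolution. q_act and q_wt hold
        # the receptive field and its kernel (any shape, equal size), as
        # unsigned/signed 8-bit integers respectively.
        sums = np.zeros(256, dtype=np.int64)      # one accumulator per weight code
        for a, w in zip(q_act.ravel(), q_wt.ravel()):
            sums[int(w) & 0xFF] += int(a)         # group activations by weight value
        acc = 0
        for code in np.nonzero(sums)[0]:
            w = code - 256 if code > 127 else int(code)  # recover signed int8 value
            acc += w * int(sums[code])            # one multiply per distinct weight
        return acc

    # Reference check: the result equals the plain multiply-accumulate.
    a = np.random.randint(0, 256, 27).astype(np.uint8)
    k = np.random.randint(-128, 128, 27).astype(np.int8)
    assert conv_output_combining_law(a, k) == int(
        np.dot(a.astype(np.int64), k.astype(np.int64)))

In hardware terms, this trades multipliers for adders and a small indexed memory, which is consistent with the abstract's claim that the scheme relaxes the parallelism limit imposed by the number of available multipliers.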
Keywords/Search Tags: deep neural networks, quantized neural networks, convolution operation, FPGA