
Optimal Designs Of Deep Neural Networks On TTA Based ASIP

Posted on: 2020-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Wu
Full Text: PDF
GTID: 2428330572476354
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, advances in chip technology have allowed neural networks to be designed deeper and larger so that they can cope with more complex and abstract tasks. The accompanying problem is that deep neural networks require large-scale floating-point operations during both training and inference, placing heavy demands on storage resources and power consumption. Developing Transport Triggered Architecture (TTA) based Application Specific Instruction Set Processors (ASIPs), which offer high performance and low power consumption while retaining the flexibility of an instruction set, is one approach to this problem. Quantization and acceleration are two common techniques in the hardware implementation of deep neural networks, and convolutional neural networks are the most representative and widely used deep neural networks. The principal work of this thesis is to optimize convolutional neural networks on a TTA-based ASIP, which comprises two main parts.

First, targeting the implementation requirements of TTA-based neural-network ASIPs, this thesis proposes an end-to-end 8-bit quantization scheme. Different quantization strategies are designed according to the characteristics of weights, activations, and gradients: symmetric affine quantization for weights, dynamic upper-limit quantization for activations, variable-precision quantization for gradients, and an approximate batch normalization algorithm (a sketch of the weight and activation quantizers follows this abstract). In experiments on multiple datasets and multiple network structures, the proposed quantization scheme achieves accuracy comparable to full-precision networks and outperforms several commonly used schemes.

Second, this thesis proposes a quantized convolution scheme based on the multiplication combining law, which integrates look-up table resources into the convolution operation, and designs a quantized convolution function unit for TTA. Kernel tiling, loop unrolling, and data-exchange schemes matched to the characteristics of quantized convolution are provided, and the overall structure of the TTA-based accelerator is described (an illustrative sketch of the combining-law idea also follows). Compared with the traditional convolution scheme, the proposed quantized convolution scheme alleviates the parallelism restriction caused by the limited number of multipliers and improves both parallelism and energy efficiency.

In this thesis, the quantization scheme is the basis of the acceleration scheme, and the acceleration scheme complements the quantization scheme. Together they constitute the optimization work of this thesis and provide support for the TTA-based ASIP implementation of deep neural networks.
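The abstract names the quantizers but not their exact formulas, so the following NumPy sketch is only one plausible reading: per-tensor symmetric affine quantization for weights, and clipping to a dynamic upper bound followed by unsigned 8-bit quantization for activations. The function names, the per-tensor scale, and the way the upper bound is supplied are illustrative assumptions, not the thesis's actual method.

    import numpy as np

    def quantize_weights_symmetric(w, num_bits=8):
        # Symmetric affine quantization: map weights in [-max|w|, +max|w|]
        # onto signed integers in [-127, 127] with one scale per tensor.
        qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
        scale = np.max(np.abs(w)) / qmax          # per-tensor scale (assumption)
        q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
        return q, scale                           # w is approximated by q * scale

    def quantize_activations_dynamic(x, upper, num_bits=8):
        # "Dynamic upper limit" quantization as read here: clip activations
        # to a dynamically tracked upper bound, then quantize the clipped
        # range [0, upper] to unsigned 8-bit codes (ReLU outputs assumed).
        qmax = 2 ** num_bits - 1                  # 255 for 8 bits
        scale = upper / qmax
        q = np.round(np.clip(x, 0.0, upper) / scale).astype(np.uint8)
        return q, scale

    # Example: the reconstruction error q * scale - w is bounded by scale / 2.
    w = np.random.randn(3, 3).astype(np.float32)
    q, s = quantize_weights_symmetric(w)
    print(np.max(np.abs(w - q.astype(np.float32) * s)))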
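The "multiplication combining law" is likewise described only at a high level. One natural reading, sketched below under that assumption, exploits the fact that 8-bit weights take at most 256 distinct values: within one output accumulation, activations multiplied by the same weight value are first summed into a table indexed by the weight code (a * x1 + a * x2 = a * (x1 + x2)), so at most 255 multiplications remain per output regardless of kernel size, and the rest of the work is additions. The binning scheme and data layout here are illustrative, not the thesis's function-unit design.

    import numpy as np

    def conv_output_combining_law(q_act, q_wt):
        # One output value of a quantized convolution. q_act and q_wt hold
        # the receptive field and its kernel (any shape, equal size), as
        # unsigned/signed 8-bit integers respectively.
        sums = np.zeros(256, dtype=np.int64)      # one accumulator per weight code
        for a, w in zip(q_act.ravel(), q_wt.ravel()):
            sums[int(w) & 0xFF] += int(a)         # group activations by weight value
        acc = 0
        for code in np.nonzero(sums)[0]:
            w = code - 256 if code > 127 else int(code)  # recover signed int8 value
            acc += w * int(sums[code])            # one multiply per distinct weight
        return acc

    # Reference check: the result equals the plain multiply-accumulate.
    a = np.random.randint(0, 256, 27).astype(np.uint8)
    k = np.random.randint(-128, 128, 27).astype(np.int8)
    assert conv_output_combining_law(a, k) == int(
        np.dot(a.astype(np.int64), k.astype(np.int64)))

In hardware terms, this trades multipliers for adders and a small indexed memory, which is consistent with the abstract's claim that the scheme relaxes the parallelism limit imposed by the number of available multipliers.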
Keywords/Search Tags: deep neural networks, quantized neural networks, convolution operation, FPGA