
Research On Key Technologies Of Energy Efficient And Reconfigurable Accelerator For Generative Neural Networks

Posted on: 2020-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: J L Yan
Full Text: PDF
GTID: 2428330626464599
Subject: Integrated circuit engineering

Abstract/Summary:
In recent years, artificial intelligence and neural network technology have played an increasingly important role in civil applications and are widely used in fields such as computer vision, speech recognition, and autonomous driving. A generative neural network is a neural network composed of convolution (CONV) layers, deconvolution (DeCONV) layers, and residual blocks. It plays a key role in computer vision tasks such as image super-resolution and style transfer, and has significant commercial value. These intelligent applications require computing devices with high-performance, real-time, adaptive, and low-power information processing capabilities. Because generative neural networks differ from traditional convolutional computation, traditional neural network accelerators, which focus on optimizing CONV layers, achieve low hardware resource utilization on DeCONV layers and residual blocks. The main goal of this paper is to improve the energy efficiency of the accelerator. Building on reconfigurable hardware technology, neural network compression algorithms, and network-on-chip mapping and scheduling, and taking into account the operational characteristics of the network itself, this paper proposes a hardware acceleration architecture and a network mapping and scheduling scheme for generative neural network acceleration. The main contents of this paper are as follows:

1. A reconfigurable precision-adaptive processing element (PE) and reconfigurable buffer bandwidth are proposed. After a fixed-point compression algorithm is applied to a network model, the bit widths of the input data and the weight data differ from layer to layer, so the parallel multipliers in a traditional accelerator are poorly utilized. This paper addresses the problem with reconfigurable PEs: the accelerator can run tasks at multiple precisions, and the parallel computing power of the architecture varies with the precision mode. A reconfigurable buffer bandwidth scheme is proposed at the same time, so that the bandwidth can be adjusted to match the varying computing power, solving the mismatch between computation and bandwidth. For applications including generative models such as style transfer and image segmentation, fixed-point compression is applied to explore the relationship among compression strength, performance, and network accuracy.

2. The computational properties of generative network models are analyzed, revealing the duality between convolution and deconvolution operations. To address unbalanced PE load, this paper studies how computation tasks are mapped onto the on-chip PE MAC array. For the deconvolution operation, an input-oriented mapping (IOM) method is proposed; for the convolution operation, an output-oriented mapping (OOM) method is proposed. The computing mode, data reuse techniques, and cooperation among the hardware modules are also fully considered. Compared with the traditional convolution mapping scheme, the proposed mapping effectively improves hardware resource utilization and the spatial parallelism of the architecture.

3. The low operation density of residual blocks is analyzed. The traditional layer-by-layer computing mode causes high power consumption and low computational efficiency when executing residual blocks. A cross-layer dataflow scheduling scheme is therefore proposed: taking full account of the cascading relationships within the residual structure, a mixed-layer computation method lets element-wise additions and CONV layers execute in parallel on the proposed accelerator, GNA, avoiding extra off-chip memory accesses. Experiments show that the scheme reduces energy consumption and improves computing power.

The GNA accelerator consumes 142 mW on average at 200 MHz and achieves a computing power of 409.6 GOPS. It supports 8-bit, 16-bit, and other mixed-precision operations, and reaches an energy efficiency of 2.05 TOPS/W on generative neural network tasks, meeting the need for low-power, high-performance, and flexible hardware accelerators for neural networks.
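The abstract mentions that a fixed-point compression algorithm assigns each layer its own bit width, but gives no formula. A common scheme (this sketch is illustrative, not the thesis's exact algorithm; the function names and the choice of symmetric saturation are assumptions) quantizes a real value to a signed fixed-point integer with a per-layer number of fractional bits:

```python
def quantize(x, frac_bits, width):
    """Symmetric fixed-point quantization: scale x by 2**frac_bits,
    round, and saturate to a signed `width`-bit integer range."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    q = round(x * (1 << frac_bits))
    return max(lo, min(hi, q))          # saturate instead of wrapping

def dequantize(q, frac_bits):
    """Map the fixed-point integer back to a real value."""
    return q / (1 << frac_bits)
```

Sweeping `width` (e.g. 16 down to 4 bits) per layer and measuring the resulting accuracy is one way to explore the compression-strength/accuracy trade-off the abstract describes.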
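The thesis does not detail the PE microarchitecture, but a standard way to build a precision-reconfigurable multiplier is to compose wide multiplies out of narrow hardware lanes, so the same lanes serve either one 16-bit task or several 8-bit tasks. The sketch below (function names and the unsigned-only simplification are our own) shows the decomposition for a 16x16 product built from four 8x8 partial products:

```python
def mul8(a, b):
    """8-bit x 8-bit unsigned multiply -- stands in for one PE lane."""
    assert 0 <= a < 256 and 0 <= b < 256
    return a * b

def mul16_from_mul8(a, b):
    """Compose a 16x16 unsigned multiply from four 8x8 partial products,
    the way a precision-reconfigurable PE reuses its narrow lanes."""
    a_hi, a_lo = a >> 8, a & 0xFF
    b_hi, b_lo = b >> 8, b & 0xFF
    return ((mul8(a_hi, b_hi) << 16)
            + ((mul8(a_hi, b_lo) + mul8(a_lo, b_hi)) << 8)
            + mul8(a_lo, b_lo))
```

In 8-bit mode the four lanes produce four independent products per cycle; in 16-bit mode they cooperate on one, which is why the architecture's parallel computing power, and hence the buffer bandwidth it needs, changes with the precision mode.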
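The abstract names input-oriented mapping (IOM) for deconvolution and output-oriented mapping (OOM) for convolution without elaborating. The 1-D sketch below (our own simplification; the thesis targets a 2-D PE array) illustrates the duality: IOM lets each input element scatter a copy of the kernel into the output, so every PE does identical work even with stride-induced zero insertion, while OOM has each PE gather the inputs for one output:

```python
def deconv1d_iom(x, w, stride):
    """Input-oriented mapping: each input element scatters the kernel
    into the output, so PE load stays balanced for deconvolution."""
    out = [0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):              # one PE per input element
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out

def conv1d_oom(x, w, stride=1):
    """Output-oriented mapping: each PE gathers the inputs that
    contribute to one output element (valid convolution)."""
    n_out = (len(x) - len(w)) // stride + 1
    return [sum(x[o * stride + k] * w[k] for k in range(len(w)))
            for o in range(n_out)]
```

Under the traditional scheme, deconvolution is first converted to convolution over a zero-padded input, wasting multiplies on zeros; scattering from the original inputs avoids that waste, which is the utilization gain the abstract claims.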
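The cross-layer scheduling idea of contribution 3 can be sketched as tile-level fusion: instead of writing a whole CONV output off-chip and reading it back for the element-wise addition, each tile's CONV result is consumed by the shortcut add while it is still in the on-chip buffer. This is a minimal sketch under strong assumptions (the `conv` stand-in is pointwise, so tiles are independent; a real CONV needs halo handling the thesis would address in hardware):

```python
def residual_block_fused(x, conv, tile):
    """Cross-layer scheduling sketch: compute CONV on one tile and fuse
    the element-wise shortcut add before moving to the next tile, so the
    intermediate CONV output never travels off-chip."""
    out = []
    for t in range(0, len(x), tile):
        xt = x[t:t + tile]
        yt = conv(xt)                       # CONV result stays on-chip
        out.extend(v + s for v, s in zip(yt, xt))  # fused shortcut add
    return out
```

Because the element-wise additions ride along with each CONV tile, the two layer types effectively execute in parallel, matching the abstract's claim of avoiding extra off-chip memory accesses for residual blocks.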
Keywords/Search Tags: Reconfigurable Computing, Deep Learning, Generative Network, Accelerator Design