
Research On Hardware Parallel Acceleration For Novel Convolutional Neural Networks

Posted on: 2020-08-23
Degree: Master
Type: Thesis
Country: China
Candidate: D G Wang
Full Text: PDF
GTID: 2518306548995939
Subject: Computer Science and Technology

Abstract/Summary:
As one of the most popular algorithms in deep learning, Convolutional Neural Networks (CNNs) have achieved great success and are widely used in speech recognition, image segmentation, image recognition, and other fields. To improve CNN performance, the number and size of network layers have grown steadily. However, simply adding more layers has run into bottlenecks, and novel convolutional neural networks have been proposed, such as deconvolution neural networks and complicated-connected convolutional neural networks. These network models have more complex structures and much greater computational complexity. Traditional general-purpose CPUs can no longer meet the computational demands of modern convolutional neural networks because of their low degree of parallelism and limited computing capability. Therefore, to enable large-scale applications of convolutional neural networks, many hardware accelerators have emerged, such as GPUs, FPGAs, and ASICs. Thanks to its reconfigurability, rich computational resources, and low power consumption, the FPGA is favored in hardware acceleration research for convolutional neural networks. Previous FPGA-based acceleration efforts focused on the design and optimization of accelerators for traditional convolutional neural networks, but FPGA accelerators for novel convolutional neural networks are still lacking.

In this paper, we present an FPGA-based sparse deconvolution neural network accelerator architecture. We implemented our design on the Xilinx VC709 development platform and evaluated the accelerator's resource utilization. Finally, we tested the performance of four practical deconvolution neural network models.

With the development of deep learning, the structure of convolutional neural networks has become more complex, the parameters have grown larger, and the computing and storage requirements have risen accordingly. The limited on-chip computing and storage resources of a single FPGA can hardly meet the needs of mapping an entire network, which makes it difficult to raise the acceleration efficiency of a single FPGA. In this paper, we present an efficient design flow for accelerating complicated-connected convolutional neural networks on multi-FPGA platforms, including directed acyclic graph (DAG) abstraction, mapping scheme generation, and design space exploration. Finally, we built a multi-FPGA system that supports flexible communication between FPGAs to support our design flow. We chose three complicated-connected convolutional neural networks, GoogLeNet, DenseNet, and LNS-net, as benchmarks. The experimental results show that the proposed multi-FPGA system achieves much higher throughput and energy efficiency than CPU and GPU implementations.
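The workload behind the first contribution can be illustrated with a minimal sketch: a transposed convolution ("deconvolution") is equivalent to inserting stride - 1 zeros between input samples and then running an ordinary convolution, and those inserted zeros are exactly the sparsity a sparse deconvolution accelerator exploits. The 1-D example below is purely illustrative, assuming nothing about the thesis's actual hardware design:

```python
# Illustrative sketch: 1-D transposed convolution ("deconvolution") via
# zero insertion followed by an ordinary convolution. The inserted zeros
# are why deconvolution workloads are sparse: many multiply-accumulates
# hit a zero operand and can be skipped by a sparse accelerator.

def zero_insert(x, stride):
    """Insert (stride - 1) zeros between consecutive input elements."""
    out = []
    for i, v in enumerate(x):
        out.append(v)
        if i != len(x) - 1:
            out.extend([0] * (stride - 1))
    return out

def conv1d(x, k):
    """Plain valid-mode 1-D convolution (cross-correlation, no kernel flip)."""
    n = len(x) - len(k) + 1
    return [sum(x[i + j] * k[j] for j in range(len(k))) for i in range(n)]

def deconv1d(x, k, stride):
    """Transposed convolution: zero-insert, pad, then convolve normally."""
    pad = [0] * (len(k) - 1)
    return conv1d(pad + zero_insert(x, stride) + pad, k)

print(deconv1d([1, 2, 3], [1, 1, 1], stride=2))  # [1, 1, 3, 2, 5, 3, 3]
```

With stride 2, roughly half of the operands fed to the convolution are inserted zeros, which is the arithmetic a sparse accelerator avoids.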
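The multi-FPGA design flow's first two steps, DAG abstraction and mapping scheme generation, can likewise be sketched in miniature: model the network as a directed acyclic graph, topologically sort it, and split the sorted layers across FPGAs while balancing compute load. The layer names, cost numbers, and greedy heuristic below are illustrative assumptions, not the thesis's actual algorithm:

```python
# Illustrative sketch of DAG abstraction + mapping scheme generation.
# A complicated-connected network (here a DenseNet-style block whose
# layers all feed a concatenation node) is modeled as a DAG, sorted
# topologically, and greedily partitioned across FPGAs by compute cost.

from collections import defaultdict, deque

def topo_sort(edges):
    """Kahn's algorithm: return nodes of the DAG in dependency order."""
    indeg, succ, nodes = defaultdict(int), defaultdict(list), set()
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
        nodes.update((u, v))
    q = deque(sorted(n for n in nodes if indeg[n] == 0))
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    return order

def partition(order, cost, n_fpgas):
    """Greedy contiguous split keeping per-FPGA compute roughly balanced."""
    budget = sum(cost[n] for n in order) / n_fpgas
    mapping, fpga, load = {}, 0, 0.0
    for n in order:
        if load >= budget and fpga < n_fpgas - 1:
            fpga, load = fpga + 1, 0.0
        mapping[n] = fpga
        load += cost[n]
    return mapping

# Hypothetical DenseNet-style block: every conv layer also feeds "concat".
edges = [("conv1", "conv2"), ("conv1", "concat"),
         ("conv2", "conv3"), ("conv2", "concat"),
         ("conv3", "concat"), ("concat", "fc")]
cost = {"conv1": 4.0, "conv2": 4.0, "conv3": 4.0, "concat": 0.5, "fc": 1.5}
print(partition(topo_sort(edges), cost, n_fpgas=2))
```

A real design-space exploration step would then evaluate many such candidate mappings against inter-FPGA communication cost, but the DAG-then-partition structure is the same.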
Keywords/Search Tags: Novel Convolutional Neural Networks, FPGA, Deconvolutional Neural Networks, Complicated-connected Convolutional Neural Networks, Multi-FPGA