
Design And Research Of Deep Learning Heterogeneous Computing System Based On FPGA

Posted on: 2021-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: H N Qiang
Full Text: PDF
GTID: 2518306122967009
Subject: IC Engineering

Abstract/Summary:
In recent years, deep learning has been widely applied in many fields, especially computer vision and speech recognition, and has driven the progress of artificial intelligence technology. Massive computing power is the foundation supporting the development of deep learning, and meeting the ever-increasing demand for it has become a hot research direction. Heterogeneous computing, with its excellent performance and flexible structure, has become a mainstream solution for improving computing power. However, running complex neural networks with large data throughput requires extensive optimization from the algorithm down to the hardware structure, and high-performance, low-power FPGA platforms address the need for highly parallel hardware.

Based on the OpenCL and ZYNQ heterogeneous computing frameworks, this thesis compares and analyzes the characteristics of on-chip and inter-chip heterogeneity and proposes an FPGA-based optimization and implementation method for heterogeneous computing. The method partitions the computing task and assigns code segments performing the same operation to the same subtask. According to the requirements of each subtask, a reusable and configurable acceleration kernel is developed on the FPGA, giving it a high reuse rate and flexibility. The highly parallel kernel design exploits the computing power of the hardware, and pipelining between kernels increases system throughput. In addition, optimizing memory access and interface communication improves the performance and energy consumption of the kernels. Compared with traditional heterogeneous computing platforms, the system has a flexible structure and an excellent energy-efficiency ratio and can be used in small embedded devices without affecting classification accuracy.

The implementation method is verified on the DE5-Net FPGA Development Kit and a Zynq XC7Z035FFG676 development board. Based on a theoretical model of hardware resources, bandwidth, and power consumption, the peak floating-point operation speed of the neural network on each device is obtained. Experimental results of a convolutional neural network built with this method show that image recognition takes only 1.1 ms at a power of 2.5 W; compared with an ARM processor of the same power consumption, the convolution operation on the heterogeneous platform is 46.8 times faster.
Keywords/Search Tags:heterogeneous computing, hardware acceleration, low power consumption, neural network, kernel reuse