
Research On Hardware Implementation And Optimization Technology Of Deep Learning

Posted on: 2018-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: J J Lin
Full Text: PDF
GTID: 2348330533969927
Subject: Electrical engineering
Abstract/Summary:
In recent years, with the rise of artificial intelligence, new intelligent algorithms represented by deep learning have been successfully applied in many engineering fields, such as machine vision, image processing, and pattern recognition. However, under the impact of industrial big data, traditional software implementations cannot meet the requirements of low cost, high real-time performance, and high fault tolerance, so a new solution is urgently needed. As a common hardware platform, the Field Programmable Gate Array (FPGA) provides large-scale distributed hardware resources and offers a short development cycle, low power consumption, and good performance. FPGA is therefore well suited to implementing compute-intensive deep learning algorithms. In this paper, FPGA is used as the hardware development platform to study the hardware implementation and optimization technology of deep learning. The main contributions of this paper are as follows.

Firstly, the overall scheme of the deep learning hardware implementation is designed. The basic theory of deep learning is analyzed in detail. Taking the convolutional neural network (CNN) as a typical example of deep learning, the topological structure and functional characteristics of the network are studied, and the specific network topology for the hardware implementation is given. According to the structural characteristics of the network topology, the overall scheme of the system is developed, and the network topology is mapped to a specific hardware circuit.

Secondly, the optimization techniques and architecture design are completed before the hardware implementation of the algorithm. FPGA is selected as the hardware porting platform, and optimization techniques for hardware implementation are studied in depth to achieve low-power, high-efficiency deep learning. A parallel architecture for the convolutional neural network, from coarse granularity to fine granularity, is designed by means of these optimization techniques.

Thirdly, the design and optimization of the CNN on the FPGA are carried out. The overall CNN architecture is implemented on the FPGA, and then, according to the structural characteristics of the CNN, each functional circuit module is designed, including the convolution calculation module, the sampling (pooling) calculation module, and the activation function module. A ping-pong cache structure is designed to optimize the data transmission structure and the data cache unit. The functional correctness of each module is verified with the ModelSim simulation software.

Finally, the whole experimental platform is built. According to the available experimental conditions, the network structure and parameters are configured, and a heterogeneous "FPGA + CPU" system is designed to complete the hardware solidification of the CNN. A comparison experiment between the software and hardware implementations is carried out, with handwritten digit recognition as the concrete application. The results of a large number of experiments show that the FPGA-based CNN designed in this paper is fully operational and achieves excellent performance.
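The three functional circuit modules named above (convolution calculation, sampling calculation, and activation function) can be illustrated with a minimal software model. This is a hedged sketch, not the thesis's hardware design: the kernel sizes, the choice of ReLU as the activation, and all function names are illustrative assumptions.

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (correlation form), as a convolution
    calculation module would compute it over a feature map."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Multiply-accumulate over the kernel window.
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

def max_pool(fmap, size=2):
    """Sampling (pooling) module: non-overlapping size x size max pooling."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

def relu(fmap):
    """Activation function module (ReLU chosen here for illustration)."""
    return [[max(0.0, v) for v in row] for row in fmap]
```

A hardware pipeline would chain these stages per layer (convolve, activate, pool); the software model is useful as a golden reference when verifying each circuit module in simulation.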
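The ping-pong cache mentioned above is a double-buffering scheme: while the computation unit consumes one buffer, the next block of data is loaded into the other, so data transfer and computation overlap. The following software analogue shows only the buffer-swapping logic; the tile format and the `compute` callback are invented for this sketch, and the true parallelism exists only in hardware.

```python
def stream_process(tiles, compute):
    """Process a stream of data tiles using two alternating (ping-pong) buffers.

    In hardware, filling one buffer and computing on the other happen
    concurrently; here the same role-swapping is shown sequentially.
    """
    buffers = [None, None]   # the "ping" and "pong" buffers
    results = []
    write = 0                # index of the buffer currently being filled
    for i, tile in enumerate(tiles):
        buffers[write] = list(tile)                      # load the incoming tile
        if i > 0:
            results.append(compute(buffers[1 - write]))  # consume the other buffer
        write = 1 - write                                # swap buffer roles
    results.append(compute(buffers[1 - write]))          # drain the final buffer
    return results
```

Because the compute unit never waits for a transfer to finish (and vice versa), this structure hides memory latency, which is why it is a common optimization for the data transmission path of FPGA accelerators.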
Keywords/Search Tags: deep learning, neural network, hardware solidification, FPGA, parallel architecture, optimization technology