
Research On Hardware Implementation And Optimization Technology Of Deep Learning

Posted on: 2018-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: J J Lin
Full Text: PDF
GTID: 2348330533969927
Subject: Electrical engineering
Abstract/Summary:
In recent years, with the rise of artificial intelligence, new intelligent algorithms represented by deep learning have been successfully applied in many engineering fields, such as machine vision, image processing, and pattern recognition. However, under the impact of industrial big data, traditional software implementations cannot meet the requirements of low cost, high real-time performance, and high fault tolerance, so a new solution is urgently needed. As a common hardware platform, the Field Programmable Gate Array (FPGA) provides large-scale distributed hardware resources and offers a short development cycle, low power consumption, and good performance. FPGA is therefore well suited to implementing compute-intensive deep learning algorithms. In this paper, FPGA is used as the hardware development platform to study the hardware implementation and optimization technology of deep learning. The main contributions of this paper are as follows.

Firstly, the overall scheme of the deep learning hardware implementation is designed. The basic theory of deep learning is analyzed in detail. Taking the convolutional neural network (CNN) as a typical example of deep learning, the topological structure and functional characteristics of the network are studied, and the specific network topology for the hardware implementation is given. According to the structural characteristics of the network topology, the overall scheme of the system is developed, and the network topology is mapped to a specific hardware circuit.

Secondly, the optimization techniques and architecture design are completed before the hardware implementation of the algorithm. FPGA is selected as the hardware porting platform, and optimization techniques for hardware implementation are studied in depth to achieve low-power, high-efficiency deep learning. A parallel architecture for the convolutional neural network, from coarse granularity to fine granularity, is designed by means of these optimization techniques.

Thirdly, the design and optimization of the CNN on the FPGA are carried out. The overall CNN architecture is implemented on the FPGA, and then, according to the structural characteristics of the CNN, each functional circuit module is designed, including the convolution calculation module, the sampling (pooling) calculation module, and the activation function module. A ping-pong cache structure is designed to optimize the data transmission structure and the data cache unit. The functional correctness of each module is verified with the ModelSim simulation software.

Finally, the whole experimental platform is built. According to the available experimental conditions, the network structure and parameters are configured, and a heterogeneous "FPGA + CPU" system is designed to complete the hardware solidification of the CNN. A comparison experiment between the software and hardware implementations is carried out, with handwritten digit recognition as the concrete application. The results of a large number of experiments show that the FPGA-based CNN designed in this paper is fully operational and achieves excellent performance.
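The three functional circuit modules named above (convolution calculation, sampling calculation, and activation function) can be illustrated with a minimal software model. This is a hedged sketch, not the thesis's hardware design: the kernel sizes, the choice of ReLU as the activation, and all function names are illustrative assumptions.

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (correlation form), as a convolution
    calculation module would compute it over a feature map."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Multiply-accumulate over the kernel window.
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

def max_pool(fmap, size=2):
    """Sampling (pooling) module: non-overlapping size x size max pooling."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

def relu(fmap):
    """Activation function module (ReLU chosen here for illustration)."""
    return [[max(0.0, v) for v in row] for row in fmap]
```

A hardware pipeline would chain these stages per layer (convolve, activate, pool); the software model is useful as a golden reference when verifying each circuit module in simulation.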
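The ping-pong cache mentioned above is a double-buffering scheme: while the computation unit consumes one buffer, the next block of data is loaded into the other, so data transfer and computation overlap. The following software analogue shows only the buffer-swapping logic; the tile format and the `compute` callback are invented for this sketch, and the true parallelism exists only in hardware.

```python
def stream_process(tiles, compute):
    """Process a stream of data tiles using two alternating (ping-pong) buffers.

    In hardware, filling one buffer and computing on the other happen
    concurrently; here the same role-swapping is shown sequentially.
    """
    buffers = [None, None]   # the "ping" and "pong" buffers
    results = []
    write = 0                # index of the buffer currently being filled
    for i, tile in enumerate(tiles):
        buffers[write] = list(tile)                      # load the incoming tile
        if i > 0:
            results.append(compute(buffers[1 - write]))  # consume the other buffer
        write = 1 - write                                # swap buffer roles
    results.append(compute(buffers[1 - write]))          # drain the final buffer
    return results
```

Because the compute unit never waits for a transfer to finish (and vice versa), this structure hides memory latency, which is why it is a common optimization for the data transmission path of FPGA accelerators.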
Keywords/Search Tags: deep learning, neural network, hardware solidification, FPGA, parallel architecture, optimization technology