Research on an FPGA-Based Parallel Acceleration Architecture for Convolutional Neural Networks

Posted on: 2019-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: W Yin
GTID: 2428330572958981
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
With the arrival of the artificial intelligence and big data era, convolutional neural networks (CNNs) have attracted growing attention. A CNN is a multi-layer neural network, and research on it has significant theoretical and practical value in image classification, pattern recognition, object detection, video surveillance, machine vision, scientific computing, and other fields. A CNN is a feedforward network whose computations are largely independent of one another, so it offers a high degree of parallelism. For this reason, more and more researchers use FPGAs to implement CNN applications. A field-programmable gate array (FPGA) is a programmable logic device that offers rich logic resources, high performance, and low power consumption. Heterogeneous computing on a CPU + FPGA platform with the OpenCL development model makes full use of the FPGA's parallelism and low power consumption, and it also shortens the development cycle and delivers better performance.

This thesis first reviews the basic concepts and application scenarios of CNNs. Because a CNN has a distinctive network topology, its structural characteristics and working principle are then analyzed. These structural characteristics give a CNN several modes of parallelism, and exploiting them fully is crucial to running a CNN in parallel. The thesis therefore analyzes in detail the advantages and disadvantages of the different parallel models, including parallelism across layers and within the convolution computation.

The CNN is optimized and accelerated with CPU + FPGA heterogeneous computing. To that end, the thesis analyzes the basic structural model of the OpenCL standard and presents optimization strategies for data parallelism, task parallelism, and memory access. A CPU + FPGA heterogeneous experimental platform is then built on the DE5-Net FPGA development board, and the design flow and overall architecture for FPGA-based heterogeneous computing are given with the FPGA's specific logic structure in mind.

Finally, the thesis takes a CNN algorithm for unmanned vehicles as the experimental subject, analyzes the algorithm's intrinsic parallelism, and implements its kernels in OpenCL on the DE5-Net FPGA platform. The kernel code is optimized with local memory, vectorization, compute unit replication, and loop unrolling. After optimization, the algorithm processes a road image in 96.85 ms, and the actual throughput reaches 49.5 GFLOPS. Compared with the CPU platform, the CPU + FPGA implementation runs 3.19 times as fast, while its power consumption is 10/186 of the CPU's. The results show that, compared with the traditional approach, the OpenCL-based parallel acceleration system designed for the FPGA platform preserves the correctness of the algorithm, effectively improves computational efficiency, and significantly reduces system power consumption, and it provides theoretical guidance for implementing large-scale CNNs.
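The reported figures can be combined into a few derived quantities. The short Python sketch below is only a back-of-the-envelope reading of the abstract's numbers, not a result from the thesis. It assumes that the throughput figure is meant as GFLOP per second sustained over the whole 96.85 ms, that the 3.19x speedup refers to the same per-image workload, and that 10/186 is the ratio of average power draw while processing. The function name `derived_figures` and the dictionary keys are hypothetical.

```python
# Figures quoted in the abstract.
FPGA_TIME_MS = 96.85       # processing time per road image on CPU + FPGA
THROUGHPUT_GFLOPS = 49.5   # reported throughput, read as GFLOP per second (assumption)
SPEEDUP_VS_CPU = 3.19      # reported speedup over the CPU platform
POWER_RATIO = 10 / 186     # reported FPGA power as a fraction of CPU power


def derived_figures():
    """Return quantities implied by the reported numbers (not measured values)."""
    fpga_time_s = FPGA_TIME_MS / 1000.0
    return {
        # Images processed per second at 96.85 ms per image.
        "frames_per_second": 1.0 / fpga_time_s,
        # CPU time per image implied by the 3.19x speedup.
        "implied_cpu_time_ms": FPGA_TIME_MS * SPEEDUP_VS_CPU,
        # Work per image if 49.5 GFLOP/s were sustained for the whole run.
        "implied_gflop_per_image": THROUGHPUT_GFLOPS * fpga_time_s,
        # Energy = power x time, so the per-image energy ratio is
        # (P_fpga / P_cpu) * (t_fpga / t_cpu) = POWER_RATIO / SPEEDUP_VS_CPU.
        "energy_ratio_fpga_to_cpu": POWER_RATIO / SPEEDUP_VS_CPU,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions the platform handles roughly 10 road images per second, and the combination of lower power and shorter run time corresponds to about 59 times less energy per image than the CPU baseline.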
Keywords/Search Tags:Convolutional neural network, FPGA, OpenCL, Heterogeneous computing