Research On Optimization Of OpenCL For FPGA-based Deep Learning Applications

Posted on: 2019-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: K Liang
GTID: 2428330548994966
Subject: Software engineering
Abstract/Summary:
With the arrival of the Big Data era, deep learning has come into wide use across many fields. Given the characteristics of deep learning algorithms and applications, accelerating them in hardware is an inevitable trend. The FPGA platform in particular has drawn attention for its high energy efficiency, short development cycle and reconfigurability. Recently, FPGA vendors such as Xilinx and Altera have released OpenCL SDKs for programming FPGAs. However, most software developers find it difficult to tune the OpenCL code of deep learning algorithms for good performance on FPGAs, because research on OpenCL optimization for FPGA-based deep learning is limited and the existing OpenCL tools and models designed for CPUs/GPUs are not directly applicable to FPGAs. To address this problem, this thesis studies general and FPGA-specific OpenCL optimization techniques and, on that basis, proposes an optimization strategy for the time-consuming convolutional layer in convolutional neural networks. The specific research work is as follows:

1) Research on general and FPGA-specific optimization techniques of OpenCL. With deep learning algorithms as the target, this thesis analyzes the general and FPGA-specific OpenCL optimization techniques and divides them into two categories: memory optimization and computing optimization. The techniques in each category are analyzed in detail in terms of their principles, usage patterns, etc.

2) Research on optimization of OpenCL for the convolutional layer on FPGA. Building on the techniques above, an optimization strategy suited to the convolutional layer is proposed. Using a real convolutional layer as an example, the strategy is illustrated from two aspects: memory optimization and computing optimization. This thesis implements convolutional layers of different scales and optimizes them following the proposed strategy.

The experimental results show that the performance of the program optimized with the proposed strategy is 8 to 40 times that of the optimized program provided by Xilinx at the same scale.
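
To make the two categories named in the abstract concrete, here is a minimal sketch in plain C (not OpenCL, and not the strategy proposed in the thesis) of how a direct convolution might be restructured. The layer shape, the function names, and the buffering scheme are all assumptions made for illustration. The small local arrays stand in for on-chip memory (the memory side), and the fixed-length reduction loop is the kind of loop an FPGA OpenCL compiler could be asked to unroll (the computing side).

```c
#include <assert.h>

/* Hypothetical layer shape for illustration; not taken from the thesis. */
enum { C_IN = 3, H_IN = 8, W_IN = 8, K = 3, C_OUT = 4 };
enum { H_OUT = H_IN - K + 1, W_OUT = W_IN - K + 1, WIN = C_IN * K * K };

/* Baseline (stride 1, no padding): every operand is read from the large arrays. */
void conv_naive(const float *in, const float *w, float *out)
{
    for (int co = 0; co < C_OUT; ++co)
        for (int y = 0; y < H_OUT; ++y)
            for (int x = 0; x < W_OUT; ++x) {
                float acc = 0.0f;
                for (int ci = 0; ci < C_IN; ++ci)
                    for (int ky = 0; ky < K; ++ky)
                        for (int kx = 0; kx < K; ++kx)
                            acc += in[(ci * H_IN + y + ky) * W_IN + x + kx]
                                 * w[((co * C_IN + ci) * K + ky) * K + kx];
                out[(co * H_OUT + y) * W_OUT + x] = acc;
            }
}

/* Restructured: stage operands in small local buffers, then reduce
 * over a loop whose trip count is a compile-time constant. */
void conv_buffered(const float *in, const float *w, float *out)
{
    float wbuf[WIN]; /* stands in for on-chip storage of one filter */
    float ibuf[WIN]; /* stands in for on-chip storage of one input window */

    for (int co = 0; co < C_OUT; ++co) {
        /* Memory side: load the filter once per output channel and reuse it. */
        for (int i = 0; i < WIN; ++i)
            wbuf[i] = w[co * WIN + i];

        for (int y = 0; y < H_OUT; ++y)
            for (int x = 0; x < W_OUT; ++x) {
                /* Memory side: gather the input window into the local buffer. */
                int i = 0;
                for (int ci = 0; ci < C_IN; ++ci)
                    for (int ky = 0; ky < K; ++ky)
                        for (int kx = 0; kx < K; ++kx)
                            ibuf[i++] = in[(ci * H_IN + y + ky) * W_IN + x + kx];

                /* Compute side: constant trip count, a candidate for unrolling. */
                float acc = 0.0f;
                for (int j = 0; j < WIN; ++j)
                    acc += ibuf[j] * wbuf[j];
                out[(co * H_OUT + y) * W_OUT + x] = acc;
            }
    }
}

/* Runs both versions on small integer-valued data (so float sums are exact),
 * asserts that every output matches, and returns how many were compared. */
int check_conv_equivalence(void)
{
    float in[C_IN * H_IN * W_IN], w[C_OUT * WIN];
    float a[C_OUT * H_OUT * W_OUT], b[C_OUT * H_OUT * W_OUT];
    int n = 0;

    for (int i = 0; i < C_IN * H_IN * W_IN; ++i) in[i] = (float)((i % 5) - 2);
    for (int i = 0; i < C_OUT * WIN; ++i)        w[i]  = (float)((i % 3) - 1);

    conv_naive(in, w, a);
    conv_buffered(in, w, b);
    for (int i = 0; i < C_OUT * H_OUT * W_OUT; ++i, ++n)
        assert(a[i] == b[i]);
    return n;
}
```

Both functions accumulate products in the same order, so the restructuring changes only where operands are read from, not the result. Whether such buffering and unrolling actually pays off on a given FPGA depends on the device and toolchain, which this sketch does not model.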
Keywords: Deep learning, FPGA, OpenCL, Optimization techniques