Deep convolutional neural networks often contain billions of parameters, and their computation and memory demands are very large, which seriously hinders their application and deployment on lightweight mobile terminals where hardware resources and power budgets are limited. The GPU, currently the mainstream neural-network computing platform, is ill-suited to mobile devices because of its power consumption, size, and energy efficiency. Field-programmable gate arrays (FPGAs) provide rich logic and computing resources; their high parallelism, high energy-efficiency ratio, and flexible configurability make them well suited to carrying deep convolutional neural networks on the mobile terminal. However, accelerating a deep neural network on an FPGA still entails heavy computation and memory consumption, so reducing resource usage and allocating memory sensibly are of great significance for deploying convolutional neural networks on mobile devices.

This thesis focuses on the implementation and acceleration of convolutional neural networks on an FPGA platform, and ultimately chooses the binary convolutional neural network as the acceleration target. Before porting the network to the platform, the following optimizations and improvements are made. First, the derivative-approximation problem that arises when training a binarized convolutional neural network is addressed: the derivative of the forward-propagation function of a binary neural network is a pulse function, which is unsuitable for back-propagation, so this thesis approximates the pulse function with the derivative of a piecewise cubic function. Second, a pre-training method for initializing the binary neural network is proposed: a floating-point convolutional neural network with a structure similar to the binary network is first trained on the ImageNet dataset, and the resulting parameters are used to initialize the binary neural network model. After initialization, a dimension-reduction operation computes the correlation coefficients between convolution kernels in the network; by deleting convolution kernels with highly repetitive features, computational redundancy during training is reduced and the network's final inference accuracy is improved.

To verify the effectiveness of the binary neural network model, this thesis deploys the final inference stage on the Xilinx Zynq-7020 FPGA and carries out inference-acceleration experiments, ultimately verifying the feasibility of accelerating the neural network model on the FPGA platform. The proposed binary neural network model achieves 55.9% top-1 and 78.6% top-5 accuracy on ImageNet. Compared with the CPU and GPU acceleration platforms, the FPGA acceleration framework designed for the binary neural network achieves energy-efficiency improvements of 373.3 times and 28.7 times, respectively.
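The derivative approximation described above can be sketched as follows. This is a minimal illustration, not the thesis's exact formulation: it assumes the forward binarization is the hard sign function, and uses an illustrative piecewise cubic surrogate F(x) = (3x - x³)/2 on [-1, 1] (clamped to ±1 outside), whose derivative replaces the pulse function during back-propagation; the actual cubic coefficients used in the thesis may differ.

```python
def binarize_forward(x: float) -> float:
    """Forward pass: hard sign, as used in a binary neural network."""
    return 1.0 if x >= 0.0 else -1.0


def approx_grad(x: float) -> float:
    """Backward pass: derivative of an illustrative piecewise cubic
    surrogate F(x) = (3x - x**3) / 2 on [-1, 1], clamped outside.
    F rises smoothly from -1 to 1, so F'(x) = (3 - 3x**2) / 2 on
    [-1, 1] and 0 elsewhere -- a smooth, finite stand-in for the
    pulse-shaped true derivative of sign(x)."""
    if -1.0 <= x <= 1.0:
        return (3.0 - 3.0 * x * x) / 2.0
    return 0.0
```

In training, `binarize_forward` would be applied to weights and activations on the forward pass, while `approx_grad` supplies the gradient of the binarization step on the backward pass, so gradients neither vanish everywhere (as with the true pulse derivative) nor blow up near the threshold.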