
Optimization And Implementation Of CNN Image Recognition Algorithm Based On Zynq

Posted on: 2022-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Xu
GTID: 2518306557965539
Subject: Electronics and Communications Engineering
Abstract/Summary:
Convolutional neural networks (CNNs) originated from the study of the mammalian visual system. With the continuous development of computer hardware and artificial intelligence theory, their advantages in large-scale image processing have gradually emerged, but their substantial computation and storage requirements pose significant challenges to implementing CNNs on embedded platforms. As reprogrammable devices, FPGAs offer low power consumption and high parallelism, which gives them unique advantages for implementing CNNs. This thesis uses the Xilinx Zynq SoC platform to explore targeted optimization of a CNN on an FPGA with limited on-chip resources, with the goal of a high-performance, low-power embedded image recognition system.

The image recognition system is composed of an image acquisition system and a CNN accelerator. The image acquisition system provides three functions: image acquisition, storage and display. For acquisition, the initial configuration of the CMOS image sensor and the extraction of RGB image data are implemented. For storage, AXI bus read and write control is implemented, and the RGB image data is written to and read from the external DDR3 memory. For display, an HDMI display IP is implemented, and the image data read from DDR3 memory is shown on the screen. To address video tearing, a double-buffer mechanism is designed to improve the smoothness of the video display.

The CNN accelerator uses data quantization to convert the CNN parameters from 64-bit double-precision floating-point numbers to 16-bit fixed-point numbers. Different network structures and strategies are designed according to the characteristics and requirements of the different CNN layers. The convolutional and fully connected layers are accelerated with loop tiling, loop pipelining and loop unrolling, and the pooling layer uses pipelined optimization. A caching strategy between the FPGA and the external memory is designed to reduce the amount of data transferred between them. The data communication interface between the processing system (PS) and the programmable logic (PL) of the Zynq is designed to improve the data transfer rate.

The image recognition system is integrated in the SDSoC development environment, and functional and performance tests are carried out on the Zynq-7035 platform. The experimental results show that, at a working frequency of 100 MHz, the system captures 720p images at a frame rate of 30 fps. Verified on the Flowers Recognition data set, the flower recognition accuracy of the system can reach 83.3%, and the system power consumption is only 3.282 W. The average recognition time of the CNN accelerator for one image is 1.9827 s, which is 656.74 times faster than a single-core ARM Cortex-A9 solution and 21.71 times faster than an Intel i7-9750H CPU solution.
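The conversion of CNN parameters to 16-bit fixed-point numbers described in the abstract can be illustrated with a minimal sketch. The abstract does not state the fixed-point format, so the split of 8 integer and 8 fractional bits (`FRAC_BITS = 8`), the saturation on overflow, and the helper names `quantize`, `dequantize` and `fixed_dot` are all assumptions made for illustration, not the thesis's actual design.

```python
# Illustrative only: the Q8.8 split below is an assumption, not the thesis's format.
FRAC_BITS = 8
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def quantize(x, frac_bits=FRAC_BITS):
    """Scale a float by 2**frac_bits, round, and saturate to the signed 16-bit range."""
    q = int(round(x * (1 << frac_bits)))
    return max(INT16_MIN, min(INT16_MAX, q))


def dequantize(q, frac_bits=FRAC_BITS):
    """Map a fixed-point integer back to the real value it represents."""
    return q / (1 << frac_bits)


def fixed_dot(weights, inputs, frac_bits=FRAC_BITS):
    """Dot product of quantized operands using a wide integer accumulator.

    Each product carries 2 * frac_bits fractional bits, so the sum is
    rescaled once at the end rather than after every multiplication.
    """
    acc = 0
    for w, x in zip(weights, inputs):
        acc += quantize(w, frac_bits) * quantize(x, frac_bits)
    return acc / (1 << (2 * frac_bits))
```

Values that are exact multiples of 2^-8, such as 0.5, survive the round trip unchanged, while values outside the representable range are clamped. Keeping the accumulator wider than 16 bits reflects a common choice in fixed-point multiply-accumulate designs; whether the thesis does the same is not stated in the abstract.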
Keywords/Search Tags:Zynq, Convolutional neural network, Image recognition, Hardware acceleration