Design Of Reconfigurable Convolutional Neural Network Accelerator And SOC System

Posted on: 2021-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2428330611967333
Subject: Integrated circuit engineering
Abstract/Summary:
In recent years, the rapid development of convolutional neural networks (CNNs) has made them more accurate than traditional algorithms, and they are now widely used in many fields. However, the continued pursuit of higher accuracy has also increased both the complexity of CNNs and the size of CNN models. The large volume of data movement and the heavy computation in a CNN pose great challenges to the power consumption and performance of terminal devices, which hinders the deployment of CNN applications in smartphones, smart cars and smart homes. Research on hardware acceleration of CNNs on embedded terminals is therefore of great practical significance for putting CNNs into real use. In addition, to ease development and application, a friendly development environment for the hardware acceleration system is needed, one that can deploy CNN applications to embedded platforms quickly and seamlessly. To address these two problems, this thesis proposes a reconfigurable CNN accelerator and a system-on-chip (SOC) design. The main work is as follows:

(1) An FPGA-based reconfigurable CNN accelerator. First, the parallelism of the convolutional layers in a CNN and the differences among them are analyzed. In view of these differences, a reconfigurable convolution computing unit is designed that supports convolution kernels of different sizes and uses hardware resources efficiently. Then, based on the reusability of data in the input feature map, an input cache module suitable for different convolution kernels is designed. Besides exploiting data reuse efficiently, it generates valid convolution windows efficiently for different input sizes and different convolution kernels. Furthermore, to overcome the limitations of a fixed convolution parallelism scheme, a flexible and customizable computing mode is designed. Through configuration parameters, the accelerator can map the input feature map onto the computing units in different ways. Each mapping corresponds to a different parallel computing mode, which makes the computing mode reconfigurable. The experimental results show that the average energy efficiency of the proposed accelerator across different layer structures is 27.2 GOPS/W, and that the accelerator is 17.3 times faster than an Intel(R) Core i7-7700.

(2) An SOC integrating the CNN accelerator, and a rapid deployment platform. Traditional CPU-based SOC systems process CNNs slowly and with low efficiency. To solve this problem, the reconfigurable CNN accelerator designed in this thesis is integrated into the SOC. In addition, a rapid deployment platform for CNNs is built on this SOC system. Algorithm researchers can quickly deploy a CNN model to the SOC system through the application programming interface (API) provided by the platform. The platform consists mainly of two parts: model quantization and conversion tools, and rapidly deployable application interfaces. Finally, the whole system is built and its function is verified by experiments, which show that the system designed in this thesis has good flexibility, versatility and extensibility.
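
The abstract describes generating valid convolution windows for different input sizes and kernel sizes, but gives no implementation details. The following is only a software sketch of that idea, not the thesis's hardware design: the function names `conv_windows` and `convolve`, the `stride` parameter, and the absence of padding are all assumptions made for illustration.

```python
from typing import Iterator, List

Matrix = List[List[int]]


def conv_windows(feature_map: Matrix, kernel_size: int, stride: int = 1) -> Iterator[Matrix]:
    """Yield every valid kernel_size x kernel_size window in row-major order.

    Hypothetical helper: the kernel size is a runtime parameter, so the same
    routine serves 1x1, 3x3, 5x5, ... kernels (no padding is assumed).
    """
    height = len(feature_map)
    width = len(feature_map[0]) if height else 0
    for top in range(0, height - kernel_size + 1, stride):
        for left in range(0, width - kernel_size + 1, stride):
            yield [row[left:left + kernel_size]
                   for row in feature_map[top:top + kernel_size]]


def convolve(feature_map: Matrix, kernel: Matrix, stride: int = 1) -> Matrix:
    """Single-channel 2D convolution (cross-correlation) built on conv_windows."""
    k = len(kernel)
    width = len(feature_map[0]) if feature_map else 0
    out_width = (width - k) // stride + 1
    if out_width <= 0:
        return []  # kernel is larger than the input: no valid window
    # Multiply-accumulate over each window, as a compute unit would.
    flat = [
        sum(window[i][j] * kernel[i][j] for i in range(k) for j in range(k))
        for window in conv_windows(feature_map, k, stride)
    ]
    # Reshape the flat list of results into output rows.
    return [flat[r * out_width:(r + 1) * out_width]
            for r in range(len(flat) // out_width)]
```

Neighbouring windows overlap in all but one column (for stride 1), which is the data reuse that an input cache can exploit. In this sketch the overlap is simply re-read from the list; a hardware input cache would avoid refetching it.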
Keywords/Search Tags: convolutional neural network, hardware accelerator, FPGA, system on chip