
Design And Implementation Of Convolutional Neural Network Hardware Accelerator

Posted on: 2021-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: H L Wang
Full Text: PDF
GTID: 2428330614953595
Subject: IC Engineering
Abstract/Summary:
Convolutional neural networks (CNNs) are widely used in face recognition, speech recognition, document analysis, license plate recognition, image recognition, and object detection. As deeper CNNs find ever more applications, improving image-processing performance has become a crucial problem. At present, most CNNs run on a central processing unit (CPU) or graphics processing unit (GPU), which is a flexible software implementation. However, a CNN is a feedforward network: there is no data feedback between layers, and the algorithm is highly parallel. A general-purpose processor, which executes instructions largely sequentially, is poorly suited to exploiting this parallelism. Considering both computation speed and system power consumption, hardware approximation and fixed-point implementation consume less energy than a software implementation and also reduce the processor load.

This thesis proposes a hardware acceleration scheme for convolutional neural networks, covering the design of the accelerator architecture and an innovation in data transmission. Direct Memory Access (DMA) is a method for fast block data transfer. An innovative DMA controller dedicated to the CNN hardware accelerator is proposed. The architecture contains four DMA controllers; each supports single-channel transmission, and all four can work in parallel without interfering with one another. Several operation modes are supported, including a basic direct-memory-access mode and three three-dimensional (3-D) data transformation modes, which greatly improve the efficiency of the accelerator. Building on the 3-D transformation modes, the DMA controller also supports data transfer between different banks of the accelerator's internal memory. This function mainly targets intermediate convolution results: if a 3-D data transformation is required before the next layer's computation, the DMA controller can be configured to read the data from the source bank, transform it, and write it to the target bank, instead of sending it out to external memory and loading it back in. This not only reduces bandwidth pressure and the probability of error, but also saves time.

A Field-Programmable Gate Array (FPGA) is a semi-custom alternative to an ASIC. It offers rich hardware resources, flexible configuration, low power consumption, and a short development cycle, making it a good platform for realizing convolutional neural networks. The designed accelerator was prototyped and verified on an FPGA. Tests with the MNIST handwritten-digit data set and the VGG16 network show a digit-classification accuracy of 98%, and the hardware acceleration effect is significant: two orders of magnitude faster than the software implementation.
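The abstract does not state the word length or number format used by the fixed-point implementation. As an illustration only, the following C sketch assumes a hypothetical Q8.8 format (8 integer bits, 8 fractional bits) and shows the kind of multiply-accumulate a fixed-point convolution unit performs, with the intermediate product widened before shifting back:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a float to hypothetical Q8.8 fixed point (the thesis does
 * not specify the actual word length; 16 bits is assumed here). */
static int16_t to_q8_8(float x)     { return (int16_t)(x * 256.0f); }
static float   from_q8_8(int16_t q) { return (float)q / 256.0f; }

/* One fixed-point multiply-accumulate step, as a convolution MAC
 * unit would compute it: widen to 32 bits for the product, then
 * shift right by 8 to return to Q8.8 scale. */
static int16_t q8_8_mac(int16_t acc, int16_t a, int16_t b)
{
    return (int16_t)(acc + (((int32_t)a * (int32_t)b) >> 8));
}
```

Working in a narrow integer format like this is what lets a hardware MAC array use small multipliers instead of floating-point units, which is the source of the energy savings the abstract refers to.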
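The thesis does not publish the DMA controller's register layout, so the following C sketch is only a software model of the idea behind the 3-D transformation mode: a descriptor (all field names hypothetical) tells the controller where a strided tile lives in the source bank, and the transfer gathers it into a dense, channel-major layout in the target bank, avoiding the round trip through external memory:

```c
#include <stdint.h>

/* Hypothetical descriptor for one DMA channel in 3-D transfer mode.
 * Field names are illustrative, not the actual register map. */
typedef struct {
    const int16_t *src;     /* source bank base address             */
    int16_t       *dst;     /* destination bank base address        */
    int w, h, c;            /* tile width, height, channel count    */
    int src_row_stride;     /* elements between rows in the source  */
    int src_ch_stride;      /* elements between channels in source  */
} dma3d_desc_t;

/* Software model of the 3-D transformation mode: gather a strided
 * (w x h x c) tile from the source bank into a dense destination
 * buffer, ready for the next layer's convolution. */
static void dma3d_run(const dma3d_desc_t *d)
{
    int16_t *out = d->dst;
    for (int ch = 0; ch < d->c; ++ch)
        for (int y = 0; y < d->h; ++y)
            for (int x = 0; x < d->w; ++x)
                *out++ = d->src[ch * d->src_ch_stride
                                + y * d->src_row_stride + x];
}
```

Because each of the four controllers in the described architecture runs one such transfer independently, several tiles can be staged in parallel while the compute array works on the previous layer.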
Keywords/Search Tags:Convolutional neural networks, FPGA, hardware, acceleration, DMA