
Design And Implementation Of A Hardware Accelerator For Binary Neural Networks

Posted on: 2021-04-01    Degree: Master    Type: Thesis
Country: China    Candidate: Z W Yang    Full Text: PDF
GTID: 2518306128452134    Subject: IC Engineering
Abstract/Summary:
Today, convolutional neural networks (CNNs) are widely used in artificial intelligence (AI) applications such as computer vision, speech recognition, and robotics. Although CNNs provide state-of-the-art accuracy on most AI tasks, this accuracy usually comes at the cost of high computational complexity. It is therefore critical to improve energy efficiency and throughput while preserving accuracy and without increasing hardware cost. At the algorithm level, binary neural networks (BNNs) have become a dark horse thanks to their great advantages in reducing data volume and computation cost. At the hardware level, FPGAs are a good choice for accelerator design owing to their energy efficiency and customizability. However, FPGA-based, BNN-specific hardware accelerators still leave substantial room for research, e.g., in the convolutional layers, the composite layers, and the handling of edge "0" (padding) data. In view of the above, this dissertation proposes an FPGA-based, energy-saving, low-cost BNN hardware accelerator. The main work includes the following:

(1) A fully binary preprocessing scheme based on quantization and threshold optimization. To address the computation of the first convolutional layer of a BNN, the image to be recognized is preprocessed before inference, and the bit width of the input image data is quantized to 1 bit. To reduce the impact of this quantization on recognition accuracy, the threshold is optimized during training, and the best threshold found is then used during inference (a sketch follows below). This scheme binarizes the input image, makes the first convolutional layer consistent with the calculation form of the other convolutional layers, and reduces the power consumption and hardware cost of traditional solutions.

(2) A combined calculation scheme based on a composite offset and 6:2 compression. This dissertation analyzes the continuity of the convolutional-layer, batch-normalization-layer, and activation-function-layer calculations in a BNN, merges and rewrites the calculation formulas of these three layers, and, exploiting the characteristics of FPGA hardware resources, proposes an efficient, low-cost BNN hardware calculation scheme (sketched below). A 6:2-compression multiply-accumulate unit is designed that, for the same amount of computation, reduces LUT resource consumption by 60%. At the same time, the convolutional, batch-normalization, and activation-function layers are merged and optimized into a composite-offset calculation unit. Compared with the traditional solution, this unit reduces data storage by up to 80%, the number of calculation cycles by a factor of five, and hardware resource consumption by a factor of six.

(3) A low-power edge-skipping scheme that sacrifices no accuracy. To address the problem that BNN hardware cannot represent the edge "0" (padding) data used by the algorithm, the special role of "0" in BNN computation is analyzed, and a method of skipping the edge "0" calculations is realized so that the edge data need be neither stored nor computed (sketched below). The result is identical to that of the algorithm that includes the edge "0" data, so the hardware's recognition accuracy matches the algorithm's. Experiments show that this scheme reduces data storage by 20% and computation by 30%, with zero accuracy loss relative to the algorithm.
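As a rough illustration of scheme (1), the following Python sketch binarizes an 8-bit grayscale input with a fixed threshold before inference. The scalar-threshold form and the example value are assumptions; the abstract does not specify how the optimized threshold is parameterized.

```python
import numpy as np

def binarize_image(img: np.ndarray, threshold: float) -> np.ndarray:
    """Quantize an image to 1 bit: pixels >= threshold map to +1, else -1.

    `threshold` stands in for the value selected by the thesis's
    training-time threshold optimization (its exact form, scalar vs.
    per-channel, is not given in the abstract).
    """
    return np.where(img >= threshold, 1, -1).astype(np.int8)

# Hypothetical usage: binarize a 28x28 grayscale image before BNN inference,
# so the first layer sees {-1, +1} data like every other convolutional layer.
img = np.random.randint(0, 256, (28, 28), dtype=np.uint8)
binary_img = binarize_image(img, threshold=128.0)
```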
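For scheme (2), the abstract does not reproduce the merged formulas, and the 6:2 compressor is an adder-tree detail that only exists at the RTL level. A standard way to fuse a convolution, batch normalization, and a sign activation in a BNN is to fold the BN parameters into a single per-channel threshold on the integer accumulator; the sketch below shows that folding under the assumption that this is what the "composite offset" corresponds to.

```python
import numpy as np

def fuse_bn_into_threshold(mu, var, gamma, beta, eps=1e-5):
    """Fold BN + sign into one integer threshold test (assumes gamma != 0).

    sign(gamma * (y - mu) / sqrt(var + eps) + beta) is equivalent to
    comparing the raw convolution sum y against tau, with the comparison
    direction flipped for channels where gamma is negative.
    """
    tau = mu - beta * np.sqrt(var + eps) / gamma
    flip = gamma < 0  # per-channel comparison direction
    return tau, flip

def conv_bn_sign(y, tau, flip):
    # y: integer accumulator from the XNOR/popcount convolution
    out = np.where(flip, y <= tau, y >= tau)
    return np.where(out, 1, -1).astype(np.int8)
```

With this folding, the composite layer needs to store only tau and flip per output channel instead of the full BN parameter set, which is consistent with the storage reduction the abstract reports.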
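For scheme (3), the thesis's hardware dataflow is not described in the abstract, but the arithmetic idea can be shown in software: because a padded "0" contributes nothing to the accumulator, clamping the convolution window to the valid image region produces the same result as zero-padding without ever storing or multiplying the border. The function below is a minimal single-channel sketch under that assumption.

```python
import numpy as np

def conv2d_skip_padding(x, w):
    """'Same'-padding cross-correlation that never materializes the padded
    border: window bounds are clamped to the valid region, so the padding
    zeros are neither stored nor computed, yet the output matches the
    zero-padded reference exactly.
    """
    H, W = x.shape
    k = w.shape[0]            # assume a square kernel with odd size
    r = k // 2
    out = np.zeros((H, W), dtype=np.int32)
    for i in range(H):
        for j in range(W):
            i0, i1 = max(i - r, 0), min(i + r + 1, H)
            j0, j1 = max(j - r, 0), min(j + r + 1, W)
            # Align the kernel slice with the clamped image window.
            out[i, j] = np.sum(
                x[i0:i1, j0:j1]
                * w[i0 - (i - r):i1 - (i - r), j0 - (j - r):j1 - (j - r)]
            )
    return out
```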
Keywords/Search Tags:Convolutional Neural Network, Binary Neural Network, Hardware Accelerator