
Research Of Binarized Convolutional Neural Network And Its Hardware Design

Posted on: 2020-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: P Q Jiang
Full Text: PDF
GTID: 2518306452471734
Subject: IC Engineering
Abstract/Summary:
In recent years, the convolutional neural network has become one of the most important research directions. As models develop rapidly, their size and computational cost surge, placing high demands on computing devices. Binarization, as one way to address this problem, has great potential for compressing and accelerating convolutional neural network models. However, because binarization does not remove all floating-point multiplications, it remains difficult to run binarized models on resource-constrained edge computing devices. On this basis, this thesis improves the original binarization method by removing all remaining multiplication operations, so that binarized models can be deployed on edge hardware, and designs a hardware computing architecture based on the improved method.

For the binarization method, three improvements are made in this thesis. First, by deriving the role of the scaling factor in the binary convolutional network, the scaling factor is shown not to aid information propagation between layers, so it can be fused into the batch normalization layer. Second, by analysing the redundant computation in the batch normalization layer, the normalization operation is simplified to a bias addition. Third, the simplified batch normalization layer is further fused with the binarization layer into a single comparison against a bias. A convolutional neural network using the improved binarization method performs the same as the original model while containing no multiplication operations, greatly lowering the demand for computational resources.

For the hardware design: first, the space-dimension parallelism of traditional neural network accelerator designs is changed to channel-dimension parallelism, in accordance with the characteristics of the binarized model. The summation part of the convolution is implemented by combining an 8-bit LUT-based POPCOUNT with an adder tree, trading off delay against resource consumption. In the pooling module, the maximum value is obtained through the cooperation of a recursive comparator and S-shaped movement of the convolution window. The binarization module compares the input with a bias to produce the binarized output. This hardware architecture adapts to different convolution kernel sizes and reuses hardware resources efficiently.

Finally, this work implements the hardware design code for the core computation modules of the binarized convolutional neural network and simulates it on the Vivado platform. The results show that the design with a parallelism of 64 reaches an average performance of 32 GOPS at 260 MHz while occupying minimal on-chip resources. The binarized convolutional neural network and its hardware architecture based on the improved method are general with respect to both hardware resources and model structure; they are suitable for flexible deployment on resource-limited edge hardware according to available system resources and performance requirements, scale well, and have theoretical and practical value.
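The fusion of batch normalization with binarization described above can be illustrated with a minimal numerical sketch (this is an assumed illustration, not the thesis's own code; the parameter values gamma, beta are made up). With a positive batch-norm scale, sign(gamma * (x - mu) / sigma + beta) reduces to a single comparison of the input against a precomputed per-channel threshold, so no normalization arithmetic is needed at inference time:

```python
import numpy as np

# Assumed illustration: batch norm followed by sign binarization,
#   y = sign(gamma * (x - mu) / sigma + beta),
# collapses to one comparison when gamma > 0:
#   y = +1  iff  x >= mu - beta * sigma / gamma
rng = np.random.default_rng(0)
x = rng.normal(size=1000)      # pre-activation values for one channel
gamma, beta = 1.5, 0.3         # hypothetical batch-norm parameters
mu, sigma = x.mean(), x.std()

# Original two-step computation: normalize, then binarize with sign()
bn = gamma * (x - mu) / sigma + beta
y_original = np.where(bn >= 0, 1, -1)

# Fused computation: a single comparison against a precomputed bias/threshold
threshold = mu - beta * sigma / gamma   # valid for gamma > 0
y_fused = np.where(x >= threshold, 1, -1)

assert np.array_equal(y_original, y_fused)
```

For gamma < 0 the comparison direction flips, which is why a hardware binarization module only needs a comparator and a stored per-channel bias.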
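The multiplication-free convolution summation can likewise be sketched in software (an assumed model of the idea, not the thesis RTL; the `binary_dot` helper and LUT are hypothetical names). The dot product of two {-1, +1} vectors equals n - 2 * popcount(a XOR b), since XOR counts sign mismatches (hardware designs often use XNOR and count matches, which is equivalent); an 8-bit lookup table stands in for the LUT-based POPCOUNT unit, with the per-byte counts then summed as an adder tree would:

```python
import numpy as np

# Hypothetical 8-bit popcount lookup table: LUT8[b] = number of 1-bits in byte b
LUT8 = np.array([bin(i).count("1") for i in range(256)], dtype=np.int32)

def binary_dot(a, b):
    """Dot product of two +/-1 vectors via XOR + table-based popcount.

    No multiplications: matches - mismatches = n - 2 * mismatches.
    """
    n = a.size
    a_bits = np.packbits(a > 0)    # encode +1 -> bit 1, -1 -> bit 0
    b_bits = np.packbits(b > 0)
    mismatches = LUT8[a_bits ^ b_bits].sum()   # per-byte popcount of XOR
    return n - 2 * mismatches

rng = np.random.default_rng(1)
a = rng.choice([-1, 1], size=64)   # 64 taps, matching a parallelism of 64
b = rng.choice([-1, 1], size=64)
assert binary_dot(a, b) == int(a @ b)
```

Splitting the popcount into 8-bit LUT lookups plus an adder tree, rather than one wide counter, is the delay-versus-resource trade-off the abstract refers to.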
Keywords/Search Tags: binarization, convolutional neural network, scaling factor, batch normalization, FPGA