
An Embedded Inference Framework for Deep Convolutional Neural Networks: Design and Implementation

Posted on: 2021-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhang
Full Text: PDF
GTID: 2428330611966931
Subject: Computer Science and Technology

Abstract/Summary:
Since 2012, deep convolutional neural networks have driven rapid progress in computer vision. To achieve better accuracy, researchers have designed ever-larger networks, from AlexNet to VGG, GoogLeNet, and ResNet. Because of privacy or network constraints, data on embedded and mobile platforms often cannot be sent to a server, so deploying convolutional neural networks directly on embedded platforms has become a new research trend. Networks with huge parameter counts are unsuitable for embedded platforms with limited computing resources, which has motivated lightweight architectures designed specifically for embedded and mobile platforms, such as MobileNet, ShuffleNet, and SqueezeNet. These lightweight networks, however, are trained on desktop platforms, and running them on an embedded platform without adaptation yields poor inference performance. To address this problem, this thesis designs and implements an inference framework for deep convolutional neural networks on embedded platforms.

The framework targets embedded platforms with ARM CPUs and consists of four components: a model conversion component, a runtime, basic network components (Net, Layer, Blob), and acceleration components. We analyze, both theoretically and experimentally, the performance bottlenecks of lightweight convolutional neural networks during inference on embedded platforms, and design corresponding optimization algorithms for 1×1 standard convolution and 3×3 depthwise separable convolution to speed up these layers. We also design a memory pool algorithm that eliminates the frequent memory allocation and deallocation introduced by the framework's lightweight inference mode, improving the performance of the framework as a whole.

Finally, the framework is evaluated on a Firefly-RK3399 development board. Experimental results show that the 1×1 convolution optimization improves performance by about 70%-90% over the unoptimized baseline, and that the 3×3 depthwise convolution optimization yields about a 50% improvement when the computational load is heavy and the CPU is weak. In combination with the lightweight inference mode, the memory pool algorithm reduces the memory footprint during inference without losing inference performance.
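To make the first optimization target concrete: a 1×1 convolution over an NCHW tensor is exactly a matrix multiplication between the Cout×Cin weight matrix and the Cin×(H·W) input, which is what makes GEMM-style blocking and vectorization applicable. The C++ reference sketch below illustrates the unoptimized computation; the layout, names, and signature are our own assumptions, not the thesis's code.

    #include <vector>
    #include <cstddef>

    // A 1x1 convolution in NCHW layout is a plain matrix multiply:
    // out[co][i] = sum_ci wgt[co][ci] * in[ci][i], i ranging over H*W pixels.
    // Illustrative reference code; an optimized version would block these
    // loops for cache reuse and vectorize the inner loop with NEON.
    void conv1x1(const std::vector<float>& in,  int Cin, int HW,
                 const std::vector<float>& wgt, int Cout, // Cout x Cin
                 std::vector<float>& out)                 // Cout x HW
    {
        out.assign(static_cast<size_t>(Cout) * HW, 0.0f);
        for (int co = 0; co < Cout; ++co)
            for (int ci = 0; ci < Cin; ++ci) {
                const float w    = wgt[static_cast<size_t>(co) * Cin + ci];
                const float* src = &in[static_cast<size_t>(ci) * HW];
                float* dst       = &out[static_cast<size_t>(co) * HW];
                for (int i = 0; i < HW; ++i)
                    dst[i] += w * src[i];
            }
    }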
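Similarly, a 3×3 depthwise convolution applies one 3×3 kernel per channel, so its arithmetic intensity is low and memory access order dominates performance. Below is a minimal, unoptimized C++ reference (stride 1, no padding, NCHW); all names are illustrative assumptions, and the thesis's optimized kernel would restructure these loops and use ARM NEON intrinsics.

    #include <vector>
    #include <cstddef>

    // Naive 3x3 depthwise convolution, stride 1, no padding, NCHW layout.
    // Each input channel is convolved with its own 3x3 kernel (hypothetical
    // reference code, not the framework's optimized implementation).
    void depthwise_conv3x3(const std::vector<float>& in, int C, int H, int W,
                           const std::vector<float>& kernels, // C * 9 weights
                           std::vector<float>& out)           // C*(H-2)*(W-2)
    {
        const int OH = H - 2, OW = W - 2;
        out.assign(static_cast<size_t>(C) * OH * OW, 0.0f);
        for (int c = 0; c < C; ++c) {
            const float* src = &in[static_cast<size_t>(c) * H * W];
            const float* k   = &kernels[static_cast<size_t>(c) * 9];
            float* dst       = &out[static_cast<size_t>(c) * OH * OW];
            for (int y = 0; y < OH; ++y)
                for (int x = 0; x < OW; ++x) {
                    float acc = 0.0f;
                    for (int ky = 0; ky < 3; ++ky)
                        for (int kx = 0; kx < 3; ++kx)
                            acc += src[(y + ky) * W + (x + kx)] * k[ky * 3 + kx];
                    dst[y * OW + x] = acc;
                }
        }
    }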
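The memory pool idea can be sketched as follows: instead of calling malloc and free for every Blob at every layer, freed buffers are kept in a size-ordered free list and reused for later requests that fit. This is a minimal illustrative sketch under our own naming assumptions, not the thesis's actual algorithm.

    #include <cstdlib>
    #include <map>

    // Minimal memory pool: released blocks go into a size-ordered free list
    // and are reused for later requests, avoiding repeated malloc/free
    // during inference (illustrative sketch; names are assumptions).
    class MemoryPool {
    public:
        void* acquire(std::size_t bytes) {
            // Reuse the smallest previously freed block that is large enough.
            auto it = free_.lower_bound(bytes);
            if (it != free_.end()) {
                void* p = it->second;
                free_.erase(it);
                return p;
            }
            // No suitable block: fall back to a fresh allocation.
            void* p = std::malloc(bytes);
            sizes_[p] = bytes;
            return p;
        }
        void release(void* p) {
            // Return the block to the pool instead of freeing it.
            free_.emplace(sizes_[p], p);
        }
        ~MemoryPool() {
            // Frees pooled blocks; blocks still in use would leak here.
            for (auto& kv : free_) std::free(kv.second);
        }
    private:
        std::multimap<std::size_t, void*> free_; // size -> reusable block
        std::map<void*, std::size_t> sizes_;     // block -> allocated size
    };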
Keywords/Search Tags: Embedded deployment of convolutional neural networks, forward computing framework of convolutional neural networks, optimization on embedded platforms, optimization of convolution