Research On CNN Deployment For Embedded Heterogeneous Computing Platform

Posted on: 2023-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhang
Full Text: PDF
GTID: 2558306911483184
Subject: Engineering
Abstract/Summary:
With the development of deep learning, technologies represented by convolutional neural networks (CNNs) are increasingly being deployed on edge and end devices, affecting people's daily lives and industrial production. However, when CNN inference tasks are deployed on embedded platforms and mobile terminals, limits on processor size and power consumption mean that the high-computation, low-latency requirements of some application scenarios cannot be met. More and more embedded platforms on the market therefore use heterogeneous computing to accelerate neural network inference, combining CPUs, GPUs, and other processors with different architectures for collaborative computing, which can effectively improve the system's computing capacity for inference tasks. Most existing neural network inference frameworks deployed on embedded platforms adopt a kernel-level parallel strategy to distribute inference tasks across multi-core processors. This ignores the large differences in inference performance between the processors of an embedded heterogeneous computing platform, and it ignores the communication overhead between processors, which also affects inference performance. Room for optimizing inference performance therefore remains. In view of these problems, the main contributions of this thesis are as follows:

(1) Based on the hardware characteristics of embedded heterogeneous computing platforms, the ARM-CL inference framework is extended and optimized. The inference task of a CNN model is divided into multiple sub-tasks at layer granularity, and the sub-tasks are placed on different processor clusters and executed in an asynchronous pipeline-parallel manner. This improves the resource utilization of the heterogeneous system and effectively reduces the communication overhead caused by data transfer between processors.

(2) Because the overall performance of a pipeline is limited by the stage with the longest execution time, further improving inference performance requires a reasonable task scheduling strategy that places sub-tasks on the processors so that the work assigned to each processor is as balanced as possible. This thesis proposes an LSTM-based performance prediction model for neural networks, which predicts the execution time of an inference task from the network parameters of the model and the hardware parameters of the processor. During task scheduling, the predicted execution times are used to balance the workload assigned to each processor, which allows dynamic balancing at higher efficiency than measuring actual execution. Experimental results show that the prediction accuracy of the performance prediction model is above 90% for various lightweight CNN models.

(3) To address the excessively large search space of task scheduling strategies caused by the structural complexity of heterogeneous multi-core systems, this thesis proposes a task scheduling algorithm based on Monte Carlo tree search. The algorithm builds task scheduling strategies on the basic idea of Monte Carlo tree search and continually adjusts the strategy during the search according to the results of the performance prediction model. Experimental results show that the Monte Carlo tree search based task scheduling algorithm takes on average 67.9% less time than the traditional task scheduling algorithm. Compared with the default strategy of the inference framework, the throughput of the optimal task allocation strategy obtained by the algorithm increases by 30.6% on average. These results indicate that the proposed task scheduling algorithm has both theoretical research significance and engineering application value.
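The pipeline bottleneck described in (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the per-layer times below are made-up numbers standing in for the output of the performance prediction model, the function names are hypothetical, communication overhead is ignored, and an exhaustive search over cut points replaces the Monte Carlo tree search, which is practical only for very small models.

```python
from itertools import combinations


def stage_times(layer_times, cuts):
    """Return the execution time of each pipeline stage.

    layer_times[p][i] is the (hypothetical) predicted time in ms of layer i
    on processor p. `cuts` holds the layer indices where a new stage starts;
    stage k is a contiguous run of layers executed on processor k.
    """
    n_layers = len(layer_times[0])
    bounds = [0, *cuts, n_layers]
    return [
        sum(layer_times[p][bounds[p]:bounds[p + 1]])
        for p in range(len(bounds) - 1)
    ]


def best_partition(layer_times):
    """Exhaustively pick the cut points that minimize the slowest stage."""
    n_procs = len(layer_times)
    n_layers = len(layer_times[0])
    best_cuts, best_bottleneck = None, float("inf")
    for cuts in combinations(range(1, n_layers), n_procs - 1):
        bottleneck = max(stage_times(layer_times, cuts))
        if bottleneck < best_bottleneck:
            best_cuts, best_bottleneck = cuts, bottleneck
    return best_cuts, best_bottleneck


# Made-up predicted per-layer times (ms) for a 6-layer model on two processors.
layer_times = [
    [4, 6, 5, 3, 2, 8],  # processor 0, e.g. a CPU cluster
    [2, 3, 2, 2, 1, 4],  # processor 1, e.g. a GPU
]

cuts, bottleneck = best_partition(layer_times)
throughput = 1000 / bottleneck  # inferences per second in steady state
print(cuts, bottleneck, throughput)
```

With these numbers the best split runs the first two layers on processor 0 and the remaining four on processor 1, so the slowest stage takes 10 ms. Steady-state throughput is set by that stage alone, which is why balancing the stages matters more than shortening any single one.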
Keywords/Search Tags:Embedded, Heterogeneous Computing, Task Scheduling Algorithm, Convolutional Neural Network, Performance Model