
MPSoC Power Optimization Strategy For Deep Learning

Posted on: 2021-07-31
Degree: Master
Type: Thesis
Country: China
Candidate: X J Wang
Full Text: PDF
GTID: 2518306512987279
Subject: Computer system architecture

Abstract/Summary:
A Multi-Processor System-on-Chip (MPSoC) usually consists of multiple processing units, memory, and a communication infrastructure. A heterogeneous MPSoC contains different types of processing units, such as the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU), so it can optimize computing performance, cost, and power by mapping tasks to specific processor types. MPSoCs increasingly integrate technologies such as deep learning, bringing far-reaching changes to fields such as the military, industry, and daily life. However, the growing task load on an MPSoC raises power consumption and degrades the real-time performance and reliability of the system. Shrinking a deep learning model reduces its accuracy, while directly increasing the cooling capacity and computing performance of an edge device raises cost and reduces portability. How to balance the computational requirements of deep learning models against the limited power budget and heat-dissipation capability of an MPSoC is therefore a subject worth studying.

Heterogeneous computing assigns tasks of different parallelism levels or data sizes to processors of different architectures to improve the processing performance and efficiency of the system. The big.LITTLE architecture dynamically balances system performance and power consumption by distributing tasks appropriately between paired high-performance and low-power processors. Inspired by heterogeneous computing and the big.LITTLE structure, this thesis studies, through experiments on NVIDIA's Jetson Tegra X2 (hereinafter TX2), the system power consumption and inference speed of different layers of deep learning models running on the different cores of an MPSoC. Based on the experimental results, a deployment strategy is proposed that reduces the power consumption of a deep learning model running on an MPSoC without changing the structure or parameters of the original model. Finally, a dynamic scheduling framework optimized for power consumption is proposed, and a comparative experiment in a real application scenario verifies its effectiveness and feasibility. The specific work is as follows:

1) Through single-layer, group, and mixed migration experiments on each layer of Yolo V3-Tiny, the system power consumption and inference speed of operations of different types and scales are measured when running on the different computing cores of the TX2, without requiring an exhaustive search of all placements.

2) Based on the migration results, the patterns of system power consumption and inference speed after the convolutional, pooling, detection, channel, and upsampling layers are migrated from the GPU to the CPU are summarized, and an algorithm is proposed that deploys the upsampling layer and the pooling layer to the CPU at the same time, reducing the system power consumption of Yolo V3-Tiny running on the TX2.

3) A power-optimized dynamic scheduling framework is proposed. Its key feature is that it raises the model's effective inference speed by reducing the processing of similar images, and uses the time saved to compensate for the inference-speed loss caused by migrating layers to the CPU, achieving a larger reduction in system power consumption at a smaller cost in inference speed.
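The deployment idea behind work items 1) and 2) can be sketched as a small cost-model search: given per-layer power and latency measurements on each core, migrate a layer to the CPU only when it saves power and the total latency stays within a budget. All layer names, wattages, and timings below are hypothetical placeholders, not the thesis's measured TX2 values.

```python
# Hypothetical per-layer profiles: layer -> (watts, milliseconds).
# Real values would come from migration experiments like the thesis's.
GPU = {"conv1": (6.0, 2.0), "pool1": (3.0, 0.5),
       "upsample": (2.5, 0.4), "detect": (4.0, 1.0)}
CPU = {"conv1": (5.5, 12.0), "pool1": (1.5, 0.8),
       "upsample": (1.0, 0.6), "detect": (3.5, 4.0)}

def assign_devices(gpu, cpu, max_slowdown=1.2):
    """Greedily migrate layers to the CPU when that cuts power draw and
    total latency stays within max_slowdown of the all-GPU baseline."""
    baseline = sum(ms for _, ms in gpu.values())
    budget = baseline * max_slowdown
    placement = {name: "gpu" for name in gpu}
    latency = baseline
    # Consider the layers with the biggest power savings first.
    for name in sorted(gpu, key=lambda n: gpu[n][0] - cpu[n][0], reverse=True):
        saves_power = cpu[name][0] < gpu[name][0]
        new_latency = latency - gpu[name][1] + cpu[name][1]
        if saves_power and new_latency <= budget:
            placement[name] = "cpu"
            latency = new_latency
    return placement, latency

placement, latency = assign_devices(GPU, CPU)
```

Under these toy numbers the greedy pass moves the pooling and upsampling layers to the CPU while the convolutional and detection layers stay on the GPU, which mirrors the kind of migration the thesis found worthwhile for Yolo V3-Tiny.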
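The scheduling idea in work item 3) — skip inference on frames that closely resemble the last processed one and reuse its result — can be illustrated as follows. The mean-absolute-difference metric and the threshold here are illustrative assumptions, not the thesis's actual similarity test.

```python
def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two flat frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def schedule(frames, infer, threshold=5.0):
    """Run infer() only on frames that differ enough from the last
    processed frame; otherwise reuse the previous result for free."""
    results, skipped = [], 0
    last_frame, last_result = None, None
    for frame in frames:
        if last_frame is not None and mean_abs_diff(frame, last_frame) < threshold:
            results.append(last_result)   # similar frame: reuse, no inference cost
            skipped += 1
        else:
            last_result = infer(frame)    # run the (partly CPU-migrated) model
            last_frame = frame
            results.append(last_result)
    return results, skipped

# Usage: three near-identical frames followed by a scene change; a toy
# "model" stands in for real inference.
frames = [[10, 10, 10], [11, 10, 9], [12, 11, 10], [200, 200, 200]]
results, skipped = schedule(frames, infer=lambda f: sum(f))
```

The time saved on the skipped frames is what the framework spends to absorb the slower CPU execution of the migrated layers.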
Keywords/Search Tags:MPSoC, heterogeneous computing technology, GPU, convolutional neural network, scheduling framework