
Optimization Methods Of NUMA-based Cloud Platform For Deep Learning Load

Posted on: 2021-01-01 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: M J He | Full Text: PDF
GTID: 1488306464957079 | Subject: Computer Science and Technology
Abstract/Summary:
During the 13th Five-Year Plan period, data centers developed rapidly and came to carry more and more value-added services. On the one hand, a growing number of data centers use cloud computing platforms built on a NUMA-based virtualized parallel architecture. On the other hand, with the rapid development of artificial intelligence, a cloud platform must be able to carry multi-tenant, multi-task applications oriented to deep learning loads, and needs a computing framework that can share deep learning processor (NPU) resources. However, three problems remain: (1) there is no parallel computing framework with virtualization support for the NUMA architecture; (2) there is no cloud computing platform that targets deep learning loads and multi-tenant, multi-task applications and shares resources among different types of computing jobs; (3) there is no optimization technology for multiple deep learning computing frameworks that shares deep learning processor resources and meets the efficiency requirements of deep learning inference tasks. To solve these three problems, the thesis focuses on memory access optimization for the NUMA virtualized parallel architecture, container quick start for the flux cloud platform, and optimization of the computing framework for inference tasks. These three aspects are interrelated: the research on virtual machine memory access optimization provides the cloud platform with a carrier for parallel computing; the research on container quick start technology reduces the management cost of a cloud computing platform oriented to deep learning loads; and the optimization of the computing framework for deep learning inference tasks makes effective use of the virtualized parallel computing system based on the NUMA architecture. The research has both theoretical and practical significance. The main work of this thesis includes:

(1) Memory Access Optimization. In a NUMA architecture, the memory access bandwidth available to a processor differs from node to node. Combined with the impact of the virtualization management layer, this leads virtual machine processors to access memory across NUMA nodes, which reduces the computing efficiency of the cloud platform. This thesis proposes, for the first time, a memory access optimization scheme for NUMA virtual machines based on process binding. It focuses on processor process binding technology, virtual machine memory preassignment adapted to the host, and a virtual machine scheduling strategy based on ensuring resource availability. Experimental results show that the scheme improves performance by 20% to 120% compared with the original cloud platform. The research results have been applied to Cloudview, the cloud computing operating system of Dawning Information Industry Co., Ltd.

(2) Container Quick Start. Container startup speed is a key factor in the performance of a cloud computing platform. At present, the computing load represented by deep learning applications has shifted from software services to microservices, which has sharply increased the number of containers and slowed container response. This thesis therefore pursues three objectives: starting containers in time, reasonably hiding service and computing resource details, and expanding and shrinking computing services dynamically and efficiently. To meet them, it proposes a container quick start technology with a service scaling and isolation mechanism, focusing on a key container generation and startup algorithm, a runtime service container control algorithm, and a dynamic container size scaling control algorithm. The experimental results show that container startup time is shortened by 30% compared with before optimization, and that the overall operating efficiency of jobs increases by 53.8% with the multi-container scheduling technology. The research results have been widely applied in many products of Dawning Company, including the container cloud platform Appfoundry, the cloud computing operating system Cloudview, the artificial intelligence platform Sothis AI, and the monitoring, management, and operation and maintenance platform Grid View.

(3) Optimization of Computing Framework. The utilization rate of computing resources is an important performance index of a cloud computing platform. Deep learning is a compute-intensive task, so it is crucial to schedule the deep learning processor (NPU) efficiently to meet the computing requirements of deep learning training and inference tasks from different tenants and different applications. This thesis breaks the boundaries between tenants and between applications in the use of resources, and proposes and implements a virtualization system based on NPU resource pooling. It focuses on a dedicated job scheduling and acceleration platform, an NPU resource pooling method, and an NPU fine-grained scheduling method. The experimental results show that the efficiency of inference tasks can be improved by 493% with different models and parameters, and by 915% with different network structures. The research results have been applied to Sothis AI, a deep learning service platform oriented to artificial intelligence.

To sum up, this thesis addresses the key technical problems of optimizing a NUMA-architecture cloud platform for deep learning loads, covering virtual machine memory access optimization, container quick start technology, and computing framework optimization for deep learning loads. The research results have been applied to software products of Dawning Company with good results.
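
The process-binding idea in item (1) of the abstract can be illustrated with a small sketch. It is not the thesis's algorithm, which the abstract does not specify. The sketch assumes a simple best-fit rule: place each virtual machine on a single NUMA node that has enough free CPUs and enough free memory, so that its vCPUs never reach across nodes. The names NumaNode, Placement, place_vm and bind_process are hypothetical and exist only for this illustration. Memory is only accounted for here, not actually reserved.

```python
import os
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass
class NumaNode:
    """Hypothetical view of one NUMA node's free resources."""
    node_id: int
    free_cpus: Set[int]
    free_mem_mb: int


@dataclass
class Placement:
    """CPUs and memory set aside for one virtual machine on one node."""
    node_id: int
    cpus: FrozenSet[int]
    mem_mb: int


def place_vm(nodes: List[NumaNode], vcpus: int, mem_mb: int) -> Optional[Placement]:
    """Pick one node that can hold the whole VM locally (assumed best-fit rule)."""
    # Only nodes that satisfy BOTH the CPU and the memory demand qualify,
    # which is what keeps every memory access node-local.
    candidates = [
        n for n in nodes
        if len(n.free_cpus) >= vcpus and n.free_mem_mb >= mem_mb
    ]
    if not candidates:
        return None  # no single node fits; the caller must queue or reject

    # Best fit: leave as few spare CPUs as possible, ties broken by node id.
    node = min(candidates, key=lambda n: (len(n.free_cpus) - vcpus, n.node_id))

    cpus = frozenset(sorted(node.free_cpus)[:vcpus])
    node.free_cpus -= cpus        # bookkeeping: these CPUs are now taken
    node.free_mem_mb -= mem_mb    # bookkeeping only, no real reservation
    return Placement(node.node_id, cpus, mem_mb)


def bind_process(pid: int, placement: Placement) -> None:
    """Pin a process to the chosen CPUs (Linux only; skipped elsewhere)."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(pid, placement.cpus)
```

In this sketch, a request that fits on no single node returns None instead of being split across nodes. That choice reflects the availability-first scheduling the abstract mentions, but the exact policy used in the thesis is an assumption here.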
Keywords/Search Tags: NUMA Virtual Machine, Memory Access Optimization, Container Quick Start, Deep Learning Processor Virtualization