A GPU Resource Pool For Remote Sharing Based On API Remoting

Posted on: 2021-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2518306503973929
Subject: Software engineering
Abstract/Summary:
With the development of technology, more and more applications take advantage of the high data parallelism and strong floating-point performance of GPUs, from traditional image rendering and video encoding and decoding to emerging fields such as data mining and machine learning. Personal computers and commercial servers can no longer meet the growing computing requirements. More and more individuals and enterprises choose to deploy programs in the cloud, and GPU acceleration has become a major selling point for many cloud vendors. Many companies have equipped the servers in their data centers with GPUs and offer corresponding GPU cloud services to provide high-performance GPU acceleration to their thousands of tenants. However, due to business cost and energy constraints, it is not feasible to provide a GPU device to every node in the data center. GPU virtualization technology emerged to solve this problem.

Unlike mature CPU virtualization technology, GPU virtualization has always been both a hot topic and a difficulty in virtualization research. Because of the particular nature of GPU devices, traditional I/O device virtualization methods cannot be applied to them. At present, most cloud vendors provide GPU devices to tenants through the pass-through approach. This coarse-grained virtualization approach leads to low utilization of GPU devices and poor scalability. In addition, because the device is directly exposed to the guest machine, the virtual machine monitor cannot monitor the GPU device. Therefore, how to effectively improve the utilization of GPU devices while retaining performance management of them is a difficult challenge.

Based on the characteristics of the GPU programming model and existing GPU virtualization solutions, this paper proposes gPooling, a scalable remote GPU resource sharing pool based on API remoting technology. By taking advantage of the characteristics of the GPU programming interface, gPooling breaks down the device barriers between different vendors and implements a device-independent remote GPU resource sharing pool that is transparent to applications. Our work makes the following contributions:

(1) We design a remote GPU resource sharing pool based on API remoting technology that is independent of the hardware and transparent to the software. It can aggregate existing GPU cluster resources and provide rendering acceleration for different devices. A single GPU device in the cluster can be virtualized into multiple vGPU devices, providing different acceleration capabilities for multiple clients. It offers great flexibility, strong isolation, and high scalability.

(2) Because the gPooling framework transmits a large amount of data, we analyzed data usage during network transmission. Based on this analysis, gPooling designs and implements a transmission framework that includes command stream compression and frame image encoding, which effectively reduces network bandwidth usage and improves the scalability of gPooling.

(3) To solve the load-balancing problem of GPU cluster resource usage, gPooling designs a multi-layer feedback scheduling algorithm. Different levels in the cluster have different degrees of visibility into resource usage. Multi-layer feedback scheduling uses information exchange and feedback between the cluster level and the device level to reach a reasonable task distribution. According to the characteristics of GPU applications, different types of programs are distributed to specific servers. The different resources of the devices in the cluster are load-balanced, device utilization is improved, and the high degree of parallelism of the GPU is maintained.

Our evaluation shows that gPooling can provide GPU acceleration to up to 40 clients. Compared with AWS Elastic GPU, because gPooling uses fine-grained isolation, it effectively avoids resource competition between applications and achieves higher scalability. Experiments also show that command stream compression and frame image encoding effectively reduce the use of bandwidth resources, and that the multi-layer feedback scheduling algorithm avoids contention for the same resources among applications of the same type and achieves the goal of resource load balancing.
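
As a rough illustration of the command stream compression idea mentioned in contribution (2), the sketch below batches intercepted API calls on the client, compresses the batch, and replays it on the server. It is only a minimal sketch under stated assumptions, not the thesis's implementation: the JSON wire format, the use of zlib, the function names `encode_command_stream`, `decode_command_stream`, and `replay`, and the stand-in handler table are all hypothetical. The call name `glDrawArrays` is used purely as an example of an intercepted OpenGL call.

```python
import json
import zlib


def encode_command_stream(commands):
    """Client side: serialize a batch of [api_name, args] calls and compress it."""
    payload = json.dumps(commands, separators=(",", ":")).encode("utf-8")
    return zlib.compress(payload)


def decode_command_stream(blob):
    """Inverse of encode_command_stream; returns a list of [api_name, args]."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def replay(blob, handlers):
    """Server side: decode the batch and dispatch each call to its handler."""
    return [handlers[name](*args) for name, args in decode_command_stream(blob)]


# Hypothetical client-side batch: rendering loops tend to repeat the same
# calls, which is why a general-purpose compressor can shrink the stream.
commands = [["glDrawArrays", [4, 0, 3]] for _ in range(200)]
blob = encode_command_stream(commands)
raw_size = len(json.dumps(commands, separators=(",", ":")).encode("utf-8"))

# Stand-in handler table (assumption); a real server would forward each
# call to the GPU driver instead of returning the vertex count.
handlers = {"glDrawArrays": lambda mode, first, count: count}
results = replay(blob, handlers)
```

Comparing `len(blob)` with `raw_size` gives a quick feel for how much a repetitive command stream shrinks before it crosses the network; the actual savings in gPooling depend on its own encoding, which the abstract does not specify.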
Keywords/Search Tags: GPU Virtualization, OpenGL Acceleration, Cloud Computing, API Remoting