| The Graphics Processing Unit (GPU) is an essential part of a computer system. GPUs were initially designed to handle graphics workloads, such as accelerating games. However, as computer systems have developed, the range of GPU applications has expanded greatly. Beyond graphics, general-purpose computations such as media transcoding and machine learning can also be accelerated by the GPU. At the same time, the trend toward cloud computing is moving a growing number of applications to the cloud, including GPU-intensive applications. Cloud service providers are therefore seeking practical and scalable GPU virtualization solutions. Unfortunately, the cutting-edge GPU virtualization solution iGVT-g still suffers from limited scalability. This paper proposes gScale, a scalable GPU virtualization solution based on iGVT-g. By taking advantage of the GPU programming model, gScale combines static partitioning and dynamic sharing into a hybrid sharing mechanism that breaks the limitation imposed by hardware. Specifically, this paper introduces three techniques for gScale: (1) the private shadow graphics translation table, which enables sharing of the global graphics memory space among virtual GPU instances; (2) ladder mapping and the fence memory space pool, which allow the CPU to access the host physical memory space (serving the graphics memory) while bypassing the global graphics memory space; (3) slot sharing, which improves vGPU performance under a high density of instances. The evaluation shows that gScale scales up to 15 guest virtual GPU instances in Linux or 12 guest virtual GPU instances in Windows, which is 5x and 4x the scalability of iGVT-g, respectively. At the same time, gScale incurs only a slight runtime overhead compared with iGVT-g when hosting multiple virtual GPU instances. |