Research On GPU Warp Scheduling Algorithm Optimization

Posted on: 2019-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Fan
Full Text: PDF
GTID: 2348330545475154
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
With the development of the integrated circuit industry, the computing capability and programmability of GPUs have improved continuously. In particular, since the emergence of programming environments such as CUDA, the complexity of general-purpose computing on GPUs has been greatly reduced, and GPU programmability, functionality, and performance have all improved significantly. GPUs have evolved into programmable, high-performance parallel computing resources that are used extensively in general-purpose workloads requiring large amounts of computation. At present, however, GPU resource utilization in general-purpose computing is low. Long-latency operations such as off-chip memory accesses are an important cause of this low utilization, and typical warp scheduling algorithms cannot hide them well.

In this thesis, I analyze several conventional warp scheduling algorithms. Under the Round Robin scheduling algorithm every warp has the same priority, so all warps reach a long-latency instruction at about the same time and no spare warp remains to hide the latency. The Greedy scheduling algorithm hides long-latency instructions slightly better, but it suffers from locality loss, which reduces the cache hit rate and generates more off-chip fetches.

To solve these problems, I design a two-level scheduling strategy based on the Greedy algorithm. Long-latency operations are hidden in two ways: Greedy scheduling within each group and two-level scheduling across groups. Two-level scheduling divides the warps into groups. The scheduling unit selects one group and issues instructions from the warps in that group. Once all warps in the group are blocked, the Round Robin scheduling algorithm selects another group. Grouping prevents all warps from blocking on long-latency operations at the same time: only one group blocks at a time, and warps in the other groups can continue to be scheduled and executed. Within a group, warps are scheduled with the Greedy algorithm, which keeps them from reaching long-latency instructions simultaneously and further helps hide long-latency operations. In simulation, this algorithm achieves a 7.6% performance improvement over the Round Robin scheduling algorithm. For some applications, the improvement is 11.2%.
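The scheduling policies described in the abstract can be illustrated with a toy cycle-level model. The sketch below is not the simulator used in the thesis. It assumes a single-issue scheduler, in-order warps that all run the same instruction list, fixed latencies, and no cache model. All names (`Warp`, `simulate`, `pick_round_robin`, `pick_two_level`, `ALU_LATENCY`, `MEM_LATENCY`, `group_size`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

ALU_LATENCY = 1    # assumed latency of a short instruction, in cycles
MEM_LATENCY = 20   # assumed latency of a long (off-chip) access, in cycles


@dataclass
class Warp:
    wid: int
    pc: int = 0        # index of the next instruction to issue
    ready_at: int = 0  # first cycle at which the warp may issue again


def is_ready(warp, program, cycle):
    return warp.pc < len(program) and warp.ready_at <= cycle


def pick_round_robin(warps, program, cycle, state):
    """Scan from the warp after the last issued one; take the first ready warp."""
    n = len(warps)
    start = (state.get("last", -1) + 1) % n
    for offset in range(n):
        warp = warps[(start + offset) % n]
        if is_ready(warp, program, cycle):
            state["last"] = warp.wid
            return warp
    return None


def pick_two_level(warps, program, cycle, state, group_size=4):
    """Greedy inside the active group; round robin over groups once it blocks."""
    groups = [warps[i:i + group_size] for i in range(0, len(warps), group_size)]
    active = state.get("group", 0)
    last = state.get("last")
    for offset in range(len(groups)):
        gid = (active + offset) % len(groups)
        candidates = [w for w in groups[gid] if is_ready(w, program, cycle)]
        if not candidates:
            continue  # this group is fully blocked (or finished); try the next
        # Greedy: keep issuing from the last warp while it is ready,
        # otherwise fall back to the lowest-numbered ready warp in the group.
        chosen = next((w for w in candidates if w.wid == last), candidates[0])
        state["group"], state["last"] = gid, chosen.wid
        return chosen
    return None


def simulate(policy, num_warps, program, **kwargs):
    """Return (cycles until the last instruction issues, idle issue slots)."""
    warps = [Warp(i) for i in range(num_warps)]
    state, cycle, idle = {}, 0, 0
    remaining = num_warps * len(program)
    while remaining:
        warp = policy(warps, program, cycle, state, **kwargs)
        if warp is None:
            idle += 1  # every unfinished warp is waiting on a long operation
        else:
            latency = MEM_LATENCY if program[warp.pc] == "mem" else ALU_LATENCY
            warp.ready_at = cycle + latency
            warp.pc += 1
            remaining -= 1
        cycle += 1
    return cycle, idle


if __name__ == "__main__":
    program = ["alu"] * 4 + ["mem"] + ["alu"] * 4
    print("round robin:", simulate(pick_round_robin, 8, program))
    print("two-level:  ", simulate(pick_two_level, 8, program, group_size=4))
```

In this toy setup, round robin drives all warps into the memory instruction within a few cycles of each other, which leaves idle issue slots, while the grouped policy staggers the memory accesses so that one group's computation overlaps another group's waiting. The cycle counts depend entirely on the assumed latencies and program, so they are not comparable to the 7.6% and 11.2% figures reported in the abstract.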
Keywords/Search Tags: GPU, warp scheduling, Greedy scheduling, Two Level scheduling