
Key Technology Research Of GPU Cluster Scheduling Management System

Posted on: 2012-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: W L Li
Full Text: PDF
GTID: 2218330362456307
Subject: Communication and Information System
Abstract/Summary:
Humanity faces many important scientific and technological problems today, such as satellite imagery data processing, genetic engineering, accurate global climate forecasting, and nuclear explosion simulation. The data involved has reached the terabyte or even petabyte scale, and processing it often requires trillions of computations or more. Everyday applications such as games and high-definition video playback also involve graphics and increasingly complex data calculations, which poses a severe challenge to computing speed.

Over the past twenty years, the performance growth of CPUs from Intel and AMD has slowed, and multi-core development has run into various problems. NVIDIA GPUs that support CUDA offer marked advantages in high-performance computing, and a GPU cluster is a sensible choice for meeting the growing demand for high-performance computing.

This paper presents the design of a GPU cluster scheduling management system that combines hierarchical scheduling with centralized scheduling, based on the characteristics of GPU computing. The hierarchy, the workflow, the design of each module, and the interaction between modules are described in detail. The design aims to make full use of GPU resources and to organize GPU execution effectively, so that total computing capability grows linearly with the number of GPUs.

A hash algorithm workload is designed to verify the feasibility of the design on a specific platform of hardware, software, and network communication interfaces. Optimizations specific to this task are also applied in the experiment. The results show that the system is efficient, scalable, and easy to operate.

Building on the experiment, this paper lists a number of factors that may affect system performance, with a view to optimizing it. It then modifies several parameters and runs further tests. The effect of the listed factors is verified through theoretical analysis and the test results.

Finally, based on the experimental results, this paper proposes future work on the design. The goal is to improve the system's fault tolerance, its recoverability, and its capacity to adapt to varying computing performance.
Keywords/Search Tags: Graphics Processing Unit, cluster, high-density computing, scheduling management, efficiency, scalability, operability