Communication Scalability of Large-Scale Parallel Computing: Analysis, Optimization and Simulation

Posted on: 2014-02-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y F Lin
GTID: 1268330422474268
Subject: Computer Science and Technology

Abstract/Summary:
As system scale grows and the performance of processing nodes rises, communication has become the key bottleneck limiting the scalability of parallel computing. The communication scalability problem asks which factors affect communication and to what extent their growing influence limits system scalability. It is one of the most challenging academic problems in the field of parallel computing.

To address the communication scalability problem, this dissertation quantifies for the first time the communication wall for parallel computing from the perspective of performance speedup, and builds a model of communication scalability. Based on the analytical results of the model, it proposes a program optimization technique guided by message independency for the program optimization problem, and an optimization technique for multi-job allocation for the task allocation optimization problem. Finally, it designs and implements a performance prediction simulator for large-scale parallel computing, which can be used to validate the correctness of the model and the scalability of various system optimization techniques.

Specifically, the main work and contributions of this dissertation are as follows:

1. Building the model of communication scalability (Chapter 2)
Currently, there is only an intuitive understanding of the communication scalability problem, rather than rigorous research on it. This dissertation proposes for the first time a quantitative description of the communication wall and an existence theorem for the communication wall. It then builds the model of communication scalability, proposes a system measurement method and a classification of parallel systems based on the model, and quantifies the extent of the communication scalability and the extent of the general communication scalability. Using specific cases, we analyze the effects of the program, the topology and the optimization methods on communication scalability, compare the extent of general communication scalability among common topologies, and point out how to optimize the communication scalability and the general communication scalability of parallel systems.

2. Proposing the program optimization technique guided by message independency (Chapter 3)
Communication hiding through instruction reordering is one of the main program performance optimization methods. Besides its own problems, it also incurs serious network resource contention among messages. By analyzing the causes of this contention, we propose for the first time the concept of message independency and study its specific meaning. For MPI (Message Passing Interface) programs, we build a program optimization model guided by message independency and based on instruction reordering. Using the model, we design and implement a program optimization approach that reduces network resource contention among messages while preserving maximal communication hiding. The experimental results show that, for CFD (Computational Fluid Dynamics) applications, this approach can greatly reduce the communication overhead and improve program performance.

3. Proposing the optimization technique for multi-job allocation (Chapter 4)
Users of large-scale parallel computing systems strongly desire that computing resources be allocated to multiple jobs in a way that satisfies their performance needs. We propose for the first time decomposing the multi-job allocation optimization problem into two sub-problems: multi-job assignment optimization and single-job task mapping optimization. For multi-job assignment optimization, we propose for the first time the closed-minimal graph-partitioning model, which transforms the multi-job assignment optimization problem into a closed-minimal graph-partitioning problem. For single-job task mapping optimization, we analyze the impact of communication protocols on communication overhead, and propose for the first time the protocol-aware process mapping model for MPI programs, PaPP. Based on these two models, we design and implement the multi-job allocation optimization approach. The experimental results show that, for the NPB (NAS Parallel Benchmarks) applications, this approach optimizes performance efficiently.

4. Designing and implementing a virtual-actual combined execution-driven simulator, VACED-SIM (Chapter 5)
Discrete event simulation is one of the most common performance prediction approaches in the field of large-scale parallel computing. Through an in-depth analysis of discrete event simulation approaches, we propose the concepts of virtual simulation and actual simulation. By comparing actual with virtual simulation and trace-driven with execution-driven methods, this dissertation categorizes for the first time the discrete event simulation approaches into four kinds along two orthogonal aspects (simulation mechanism and event-driven mechanism). Based on the characteristics of scalability prediction for large-scale parallel computing, we propose for the first time a model of the fourth kind of discrete event simulation approach: the virtual-actual combined execution-driven simulation approach. With this model, we design and implement a lightweight virtual-actual combined execution-driven simulator, VACED-SIM. In this simulator, we propose for the first time, and use, fine-grained methods for defining activities and events to increase the precision of the simulation. Experiments on a Tianhe-1A subsystem show that VACED-SIM has high accuracy and efficiency.
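To make the idea of discrete event simulation for performance prediction concrete, the following is a minimal generic sketch in Python. It is not VACED-SIM and does not reproduce the dissertation's virtual-actual combined model. The function name `simulate_ring`, the ring-exchange workload, and the cost model (transfer time = latency + bytes / bandwidth, with no contention) are all assumptions chosen only for illustration.

```python
import heapq
import itertools


def simulate_ring(num_procs, steps, compute_time, msg_bytes, latency, bandwidth):
    """Predict the finish time of a ring exchange with a tiny event loop.

    Hypothetical workload: in every step each rank computes, then sends one
    message to its right neighbour; a rank starts its next step once it has
    finished computing and has received the message from its left neighbour.
    """
    # Assumed cost model: fixed latency plus size over bandwidth, no contention.
    transfer_time = latency + msg_bytes / bandwidth
    tie = itertools.count()  # tie-breaker so events with equal time pop in FIFO order
    queue = []               # priority queue of (time, seq, kind, rank, step)

    for rank in range(num_procs):
        heapq.heappush(queue, (compute_time, next(tie), "compute_done", rank, 0))

    computed = [-1] * num_procs  # last step whose computation finished on each rank
    received = [-1] * num_procs  # last step whose message arrived at each rank
    finish = 0.0

    while queue:
        now, _, kind, rank, step = heapq.heappop(queue)
        if kind == "compute_done":
            computed[rank] = step
            right = (rank + 1) % num_procs
            heapq.heappush(
                queue, (now + transfer_time, next(tie), "msg_arrive", right, step)
            )
        else:  # "msg_arrive"
            received[rank] = step

        # A step is complete on a rank once both activities for it are done.
        if computed[rank] == step and received[rank] == step:
            finish = max(finish, now)
            if step + 1 < steps:
                heapq.heappush(
                    queue,
                    (now + compute_time, next(tie), "compute_done", rank, step + 1),
                )
    return finish


if __name__ == "__main__":
    # 4 ranks, 3 steps, 1.0 s compute, 1000-byte messages, 0.5 s latency, 1000 B/s.
    print(simulate_ring(4, 3, 1.0, 1000, 0.5, 1000.0))
```

The sketch keeps simulated time separate from wall-clock time and advances it only when an event is popped from the queue, which is the core of the discrete event approach. A finer-grained simulator would split each activity into more events (for example protocol handshakes or link contention), which this toy model deliberately omits.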
Keywords/Search Tags: Parallel computing, Communication scalability, Communication primitive, Communication hiding, Communication contention, Job allocation, Communication protocol, Scalability prediction, Discrete event simulation