
Research On Performance Skeleton Based Performance Evaluation Of Parallel Tasks

Posted on: 2011-06-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhang
Full Text: PDF
GTID: 2178330338479962
Subject: Computer Science and Technology
Abstract/Summary:
Effective resource management and scheduling are essential for distributed computing, and predicting the running time of parallel tasks on different computing resources is the foundation of many scheduling approaches. Prediction based on a Performance Skeleton has been proposed in recent years. First, the original parallel program is run once on a controlled test bed and its communication traces are recorded. Then a Performance Skeleton that reflects the runtime behavior of the original program is reconstructed from the traces. Finally, the running time of the original program on different computing resources is predicted from the running time of the Performance Skeleton. Because this method is based on the running time of an actual program, improving its details can be expected to yield more accurate estimates than traditional prediction methods based on modeling analysis, while avoiding some of the limitations of prediction methods based on historical results or modeling analysis. This thesis studies all of the sub-problems of the Performance Skeleton method.

For recording the execution trace, this thesis designs a method that captures all communication traces of a parallel program at run time. Using the PMPI interface of the MPI library, we add wrapper functions that capture all communication traces without changing the original program or affecting its operation. For merging these traces, we study the characteristics of collective communication and one-to-one (point-to-point) communication and design a method for structuring the communication traces, on which a trace-merge algorithm is based.

Compressing the cyclic traces is the most central and most difficult of these sub-problems. This thesis converts it into the problem of compressing contiguous repeated substrings and proposes an algorithm based on the suffix array.
Its time performance is better, both in theory and in practice, than that of the best existing algorithms.

For automatic reconstruction of the Performance Skeleton, this thesis addresses the problem of simulating computation time and communication time and the problem of completing the parameters of communication functions, and designs a method for reconstructing the Performance Skeleton automatically.

This thesis also designs a prediction system for distributed computing tasks based on the Performance Skeleton and carries out an experiment on a cluster using NPB (the NAS Parallel Benchmarks), a standard set of performance benchmark applications. In the experiment, we predicted the running time of NPB applications with the Performance Skeleton method and compared the predictions with the actual running times in order to verify the validity and accuracy of the methods for the sub-problems. The experiment showed that these methods can accurately estimate the running time of computing tasks: the error was less than 3% for homogeneous clusters and less than 10% for heterogeneous clusters.
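
The trace-compression sub-problem described above, collapsing contiguous repeats in a communication trace, can be illustrated with a minimal sketch. This is not the suffix-array algorithm of the thesis: it is a naive greedy scan, shown only to make the input and output of the problem concrete. The function names `compress_repeats` and `expand`, and the representation of a trace as a list of MPI function names, are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# A compressed trace is a list of (block, count) pairs: `block` is a tuple of
# events that occurs `count` times in a row in the original trace.
Compressed = List[Tuple[Tuple[str, ...], int]]


def compress_repeats(trace: Sequence[str],
                     max_period: Optional[int] = None) -> Compressed:
    """Greedily collapse contiguous repeats (hypothetical helper).

    At each position, try every block length p and keep the one whose
    repetitions cover the most events. Naive scan, not a suffix array.
    """
    events = tuple(trace)
    n = len(events)
    out: Compressed = []
    i = 0
    while i < n:
        best_period, best_count = 1, 1
        limit = (n - i) // 2  # a block must fit at least twice to repeat
        if max_period is not None:
            limit = min(limit, max_period)
        for p in range(1, limit + 1):
            block = events[i:i + p]
            count = 1
            # Count how many times `block` repeats back to back.
            while events[i + count * p:i + (count + 1) * p] == block:
                count += 1
            if count > 1 and count * p > best_count * best_period:
                best_period, best_count = p, count
        out.append((events[i:i + best_period], best_count))
        i += best_period * best_count
    return out


def expand(compressed: Compressed) -> List[str]:
    """Rebuild the original trace from its compressed form."""
    result: List[str] = []
    for block, count in compressed:
        result.extend(block * count)
    return result


if __name__ == "__main__":
    trace = (["MPI_Init"]
             + ["MPI_Send", "MPI_Recv"] * 3
             + ["MPI_Barrier", "MPI_Finalize"])
    for block, count in compress_repeats(trace):
        print(count, "x", block)
```

On the sample trace, the three `MPI_Send`/`MPI_Recv` iterations collapse into a single block with a repeat count of 3, and `expand` restores the original sequence. The greedy choice is not guaranteed to give the shortest encoding and does not handle nested loops.
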
Keywords/Search Tags: distributed computing, performance skeleton, parallel communication trace, contiguous repeats in a string, MPI, suffix array