
An Auto Performance Profiling For Parallel Programs On LLVM

Posted on: 2022-11-23; Degree: Master; Type: Thesis
Country: China; Candidate: J Y Zhao
GTID: 2518306752454354; Subject: Master of Engineering
Abstract/Summary:
As computing becomes more closely integrated with every aspect of society, computers are frequently used to solve a wide range of scientific problems. These applications need to perform efficient operations on massive amounts of data. Parallel computing was proposed to meet this demand: it solves problems by combining multiple processors into a supercomputer. As parallel programs and supercomputers grow more complex, a gap has opened between the actual and the expected performance of parallel programs. Although the computing power of supercomputing platforms has increased rapidly in recent years, the machine utilization of parallel programs has not improved, which shows that considerable performance remains to be exploited.

Accurately measuring the performance of parallel programs makes it possible to analyze their running state, find performance bottlenecks, and improve efficiency. However, the complexity of parallel programs, the diversity of programming languages, differences in individual programming habits, and the heterogeneity of high-performance computing platforms make accurate measurement very difficult. In addition, profiling requires not only recording program performance but also storing and locating the program's runtime information efficiently. These steps increase program overhead, alter the program's performance characteristics, and reduce the accuracy of the profiling results. Achieving low-cost, high-accuracy profiling is therefore a hard challenge.

This paper proposes LPerf, a low-cost, high-accuracy profiling tool. It obtains accurate running times for the functions in a program and locates compute-intensive and communication-intensive functions at low cost. The main contributions of this paper are as follows:

· A preprocessing method is proposed to reduce the runtime overhead that LPerf adds to the source program. In addition, LPerf implements automatic instrumentation with adjustable granularity, allowing users to balance measurement accuracy against overhead.
· An aggregated parent-child call relationship is proposed, which enables LPerf to locate the call relationships of functions efficiently at runtime. Aggregating parent-child call relationships reduces memory and time overhead, and red-black trees are used as the data structure that stores the call relationships to further speed up lookup.
· A profiling-aware method for precisely calculating the running time of functions is proposed. It eliminates the error introduced by the running time of LPerf itself, so that function running times are measured accurately.

The performance of LPerf is verified on a widely used benchmark program and a large-scale scientific computing program. Experimental results show that LPerf achieves high profiling accuracy at low cost. The error rate and time delay of the measurement results are 0.02% and 1.6%, respectively. Compared with the tool it is evaluated against, LPerf achieves better accuracy, precision, and overhead.
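
The abstract does not show how LPerf implements these ideas, so the following is only an illustrative sketch of two of them: aggregating measurements per parent-child call edge, and subtracting the profiler's own cost from measured times. Everything in it is an assumption made for illustration: the class name `EdgeProfiler`, the `enter`/`exit` probes, the injectable `clock`, and the model in which every probe has a fixed cost `probe_cost` that falls inside the caller's measured interval. It also swaps in a plain Python dictionary where the abstract says LPerf uses red-black trees.

```python
from collections import defaultdict


class EdgeProfiler:
    """Toy profiler (hypothetical, not LPerf): one record per (parent, child) edge."""

    def __init__(self, clock, probe_cost=0.0):
        self.clock = clock            # callable returning the current time
        self.probe_cost = probe_cost  # assumed fixed cost of one enter or exit probe
        self.stack = []               # frames: [name, start_time, probes_inside]
        # (parent, child) -> [call_count, total_time]; a dict stands in for the
        # red-black tree mentioned in the abstract.
        self.edges = defaultdict(lambda: [0, 0.0])

    def enter(self, name):
        # Record the start time of a new call frame.
        self.stack.append([name, self.clock(), 0])

    def exit(self):
        name, start, probes_inside = self.stack.pop()
        # Remove the cost of every probe that ran inside this frame's interval.
        elapsed = self.clock() - start - probes_inside * self.probe_cost
        parent = self.stack[-1][0] if self.stack else "<root>"
        record = self.edges[(parent, name)]
        record[0] += 1        # repeated calls on the same edge share one record
        record[1] += elapsed
        if self.stack:
            # The caller's interval contained this call's two probes plus
            # all probes nested inside this call.
            self.stack[-1][2] += probes_inside + 2
```

With this design, calling the same child twice from the same parent updates a single record rather than storing two call-tree nodes, which is the memory saving that aggregation aims for. The inclusive time reported for the parent excludes the assumed probe cost of every nested call.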
Keywords/Search Tags: LLVM, Instrumentation, Profiling, Parallel computing