Font Size: a A A

Research On Analysis And Optimization For OpenMP Program

Posted on:2010-05-07Degree:DoctorType:Dissertation
Country:ChinaCandidate:H T MaFull Text:PDF
GTID:1118330332978531Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Human beings are facing the third software crisis, with the widely use of multi-core processors in multi-media, embedded equipments, personal computers, commercial services and high performance computations. Though many program models are presented, none has shown superiority than the others.OpenMP is a standard program model supported by many hardware vendors (DEC, Intel, IBM, SGI, DOEASCI, etc). The advantage of OpenMP is that it's easy to program, and can be developed incrementally, and supports many programming languages, and it also can run on different operating systems (UNIX, Linux, Windows, etc). With commercial compilers being able to compile OpenMP programs, more and more attentions have been paid to analyze and optimize OpenMP programs.This thesis expatiates on some technologies to validate the correctness and optimize the performance of OpenMP programs. Based on a thorough study of the Single Static Assignment form, an algorithm is presented to analyze the May-Happened-in-Parallel information of OpenMP based on the Barrier SSA. And with the technologies developed by automatic parallelization, such as dependence analysis, data and computation decomposition, a method is presented to optimize OpenMP programs at loop level. The main contributions of this thesis are as follows:1. With the new concept of Dominance Frontier Inverse, a multi-variable ? -function placement algorithm is presented, which can directly work on the dominator tree. From the perspective of Dominance Frontier Inverse, the level of a node in the set DFI(n) must be less than the level of n on the dominator tree. The algorithm visits the dominator tree in a bottom-up traversal base on the concept of Dominance Frontier Inverse, and place (?) -nodes for each level of join nodes. Compared to the traditional algorithms, the multi-variable algorithm is more effective without the computation of Dominance Frontier, and the cost of constructing the Barrier SSA is also reduced.2. The rule to judge whether two nodes can run in parallel is presented based on the virtual thread number which is extended with the node in OpenMP parallel flow graph. According to the syntax of OpenMP programs, a virtual thread number is assigned to each node. If the virtual thread number of two nodes is not the same, then the two nodes can run in parallel. The advantage of the extended OpenMP control flow graph is that it doesn't need to build two copies of the control flow graph for parallel construction.3. Based on a deep resarch on Single Static Assignment, a May-Happen-in-Parallel analysis is presented on the form of Barrier SSA. We take the barrier statement as a local variable defined in a parallel region, and construct the SSA form of barrier variable to explicitly represent the program phases. With the use of the Barrier SSA, MHP analysis doesn't need to travel the OpenMP flow graph up and down for several times, and can also be maintained by incremental. 4. Incorporated with the technologies developed by automatic parallelization, a parallel region restruction algorithm is designed and implemented. A larger SPMD region is got by merging the two adjacent parallel constructions, and expanding the inner parallel construction to outer loop. And then the cost of the barrier is eliminated or replaced with the help of the technologies, such as, scalar analysis, dependence analysis, and data and computation decomposition. Experimental results show that the performance of the program is improved.5. An algorithm of data and computation decomposition of computation priority is presented to optimize barriers in OpenMP programs. According to the directive of OpenMP programs, we get the computation decomposition, and then we try to find a consistent data decomposition. With the information of data and computation decomposition, the algorithm can reduce the cost of control synchronization, and also make it possible for the backend to optimize data synchronization.
Keywords/Search Tags:OpenMP, Single Static Assignment, Dominance Frontier Inverse, Barrier SSA, May-Happen-in-Parallel, Data and Computation Decomposition, Barrier Elimination and Replacement
PDF Full Text Request
Related items