Analysis and optimization with programmer-controlled memory consistency modeling of OpenMP scientific codes

Posted on: 2005-07-27
Degree: Ph.D.
Type: Dissertation
University: University of Delaware
Candidate: Hisley, Dixie M.
Full Text: PDF
GTID: 1458390008980495
Subject: Computer Science
Abstract/Summary:
The main contribution of this dissertation is an investigation of how classical compiler analyses and optimizations for sequential programs can be extended to compile scientific codes written in the explicitly parallel shared-memory standard OpenMP, with the added flexibility of a programmer-controlled memory consistency model guaranteed for different parts of the program. A new combined memory consistency model is developed to overcome the unnecessary synchronization overhead imposed by a purely sequential consistency approach. Program slicing is used to decompose the program so that either the location or the sequential memory consistency model can be supported in different parts of the same program. Experiments comparing the programmer-controlled and sequential memory consistency models showed that programmer-controlled consistency achieves a factor-of-two speedup over sequential consistency for some benchmark codes running on 32 processors. This comes at negligible space overhead at analysis time, since the concurrent control flow graph is also the main data structure for the compiler optimizations.

While static interprocedural slicing of sequential codes is well understood and used in a variety of applications, no algorithms have yet been developed for static interprocedural slicing of shared-memory parallel programs. To enable static slicing of parallel programs, a new intermediate program representation, the threaded System Dependence Graph (tSDG), is developed to encompass the parallel and worksharing constructs of OpenMP. The concept of transitive dependence is redefined to include dependences caused by conflict edges in shared-memory parallel programs, thus enabling interprocedural slicing of the new representation. Examples of interprocedural slicing over the tSDG representation are presented.

Another contribution of this dissertation is an aggressive optimization algorithm, called CSSAPRE, that performs partial redundancy elimination (PRE) on explicitly parallel programs represented in concurrent static single assignment (CSSA) form. The overarching implication of this research is that explicitly parallel shared-memory computational simulations can be made more efficient and better optimized than current shared-memory parallel simulations.
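To make the synchronization overhead concrete, the following is a minimal sketch (not taken from the dissertation; the producer-consumer pattern and all variable names are illustrative assumptions) of how OpenMP's relaxed memory model lets the programmer flush shared data only where ordering is actually needed, whereas enforcing sequential consistency would, in effect, require every shared access to be globally ordered:

    #include <stdio.h>

    int main(void) {
        double sum = 0.0;  /* shared result produced by one thread       */
        int done = 0;      /* shared flag signalling the result is ready */

        #pragma omp parallel sections shared(sum, done)
        {
            #pragma omp section
            {   /* producer: work stays thread-private until publication */
                double local = 0.0;
                for (int i = 1; i <= 1000; i++)
                    local += 1.0 / i;
                sum = local;
                #pragma omp flush(sum)   /* make sum visible ...        */
                done = 1;
                #pragma omp flush(done)  /* ... before raising the flag */
            }
            #pragma omp section
            {   /* consumer: spin until the flag becomes visible        */
                /* (modern OpenMP would use atomics for the flag; the   */
                /* flush-only idiom matches the OpenMP of this era)     */
                int ready = 0;
                while (!ready) {
                    #pragma omp flush(done)  /* re-read the shared flag */
                    ready = done;
                }
                #pragma omp flush(sum)     /* pick up the published sum */
                printf("sum = %f\n", sum);
            }
        }
        return 0;
    }

Under sequential consistency every access to sum and done would conceptually carry an implicit flush; the programmer-controlled placement above confines the ordering cost to four points.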
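The conflict edges that the tSDG adds can likewise be illustrated with a hypothetical fragment (again an assumption for illustration, not an example from the dissertation): two unordered accesses to the same shared location in different threads are exactly what a sequential System Dependence Graph cannot express.

    #include <stdio.h>

    int main(void) {
        int x = 0, y = 0;  /* both shared by the sections below */

        #pragma omp parallel sections shared(x, y)
        {
            #pragma omp section
            x = 42;    /* definition of x in one thread ...           */

            #pragma omp section
            y = x;     /* ... unordered use of x in another thread:   */
                       /* in the tSDG this write/read pair is joined  */
                       /* by a conflict edge, so a slice taken on y   */
                       /* must cross that edge to include the write   */
        }
        printf("y = %d\n", y);  /* the race means this prints 0 or 42 */
        return 0;
    }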
Keywords/Search Tags: Memory, Program, Sequential, OpenMP, Codes, Interprocedural slicing