
Research On Techniques To Improve The Performance Of OpenMP System On Cluster

Posted on: 2005-03-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L B Zhang
Full Text: PDF
GTID: 1118360185995725
Subject: Computer system architecture
Abstract/Summary:
With the development of microprocessor and high-speed network technology, clusters of workstations or PCs built from COTS (commercial off-the-shelf) components have emerged as a cost-effective and scalable alternative to high-end computing platforms. They are used effectively for scientific and engineering applications and are rapidly becoming mainstream. For clusters to be widely adopted, they need a suitable programming model. Traditionally, the message-passing model has been matched to the cluster architecture, but writing parallel programs in that model is hard for users, whereas programming in the shared-memory model is easy. The OpenMP Application Programming Interface (API) is a de facto standard for parallel programming on shared-memory multiprocessors. It is much easier to program and supports an incremental approach to parallelizing sequential programs. An OpenMP system on a cluster provides an OpenMP computing environment on a cluster of workstations or PCs, combining the programmability of shared memory with the scalability of clusters. In most cases, a cost-effective COTS network interconnects the processing nodes, so the communication overhead of a cluster is usually high, which makes it difficult for OpenMP on a cluster to achieve high performance. Improving the performance of OpenMP on clusters is therefore very important for their wide adoption.
In this thesis, several techniques are studied in depth and implemented as directive extensions to improve the performance of OpenMP on clusters. Software distributed shared memory (DSM) systems construct a NUMA-like shared-memory abstraction on a cluster. By analyzing the gap between the UMA architecture and the NUMA-like cluster architecture, the thesis identifies the dominant factor in the performance of OpenMP programs on clusters: whether the data layout matches the data access pattern.
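As a minimal sketch of the incremental, loop-level style that OpenMP allows, the function below is an ordinary sequential dot product with a single standard OpenMP directive added. The function name `dot` and its signature are illustrative assumptions and do not come from the thesis. A compiler without OpenMP support ignores the pragma and the code still runs sequentially.

```c
#include <stddef.h>

/* Illustrative example (not from the thesis): a sequential dot product
   parallelized by adding one standard OpenMP directive. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    long i; /* signed loop index, accepted by all OpenMP versions */

    /* Iterations are split among threads; each thread keeps a private
       partial sum, and the partial sums are added together at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < (long)n; i++)
        sum += x[i] * y[i];

    return sum;
}
```

The directive says nothing about where `x` and `y` reside. On a shared-memory multiprocessor that is harmless, but on a cluster it is exactly the point at which data layout and data access can diverge.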
A flexible data distribution directive extension and two effective loop scheduling algorithms based on the owner-computes principle, namely Locality-Based Scheduling and Locality-Based Dynamic Scheduling, are proposed to attain good performance. Because the data distribution directive is unsuitable for irregular applications, an indirect directive is proposed to improve the performance of irregular applications whose kernels are sparse-matrix operations. Several well-known benchmarks are used to evaluate these OpenMP directive extensions. Experimental results show that programs written with these directive extensions perform as well as versions written in SPMD style, while remaining as easy to program as loop-level versions. Programming with these directive extensions is therefore an effective style for OpenMP systems on clusters. To evaluate the performance of OpenMP systems on clusters, this thesis compares the performance of OpenMP/JIAJIA, an OpenMP system on a cluster, with that of the Message Passing Interface (MPI),...
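The abstract does not give the details of Locality-Based Scheduling, so the following is only a sketch of the general owner-computes idea it builds on, under stated assumptions: a simple block distribution of an array across nodes, with each node executing only the iterations whose data it owns. All function names here are hypothetical. They are not the thesis's directive syntax or the JIAJIA API. For comparison, the sketch also counts how many iterations would touch remote data if iterations were dealt out round-robin without regard to the layout.

```c
#include <stddef.h>

/* Assumed block distribution (hypothetical, for illustration only):
   element i of an n-element array lives on node i / ceil(n / nodes).
   All functions assume n > 0 and nodes > 0. */
static size_t block_size(size_t n, size_t nodes)
{
    return (n + nodes - 1) / nodes;
}

size_t owner_of(size_t i, size_t n, size_t nodes)
{
    return i / block_size(n, nodes);
}

/* Owner-computes range: node `me` runs the iterations [*lo, *hi),
   which are exactly the indices of the elements it owns. */
void owner_range(size_t me, size_t n, size_t nodes, size_t *lo, size_t *hi)
{
    size_t b = block_size(n, nodes);
    *lo = me * b < n ? me * b : n;
    *hi = (me + 1) * b < n ? (me + 1) * b : n;
}

/* Iterations dealt out round-robin (iteration i runs on node i % nodes),
   ignoring the layout: count those whose element is owned elsewhere. */
size_t remote_accesses_cyclic(size_t n, size_t nodes)
{
    size_t remote = 0;
    for (size_t i = 0; i < n; i++)
        if (i % nodes != owner_of(i, n, nodes))
            remote++;
    return remote;
}

/* Same count when each node runs only its owner-computes range. */
size_t remote_accesses_owner(size_t n, size_t nodes)
{
    size_t remote = 0;
    for (size_t me = 0; me < nodes; me++) {
        size_t lo, hi;
        owner_range(me, n, nodes, &lo, &hi);
        for (size_t i = lo; i < hi; i++)
            if (owner_of(i, n, nodes) != me)
                remote++;
    }
    return remote;
}
```

With 8 elements on 4 nodes, the round-robin assignment leaves 6 of the 8 iterations accessing remote data, while the owner-computes assignment leaves none. On a software DSM system each remote access can cost a network round trip, which is why matching the schedule to the layout matters.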
Keywords/Search Tags:cluster computing, OpenMP, Software DSM, data distribution, loop scheduling algorithm, JIAJIA