Computation Model And Performance Optimization On Shared Memory Architecture

Posted on: 2011-08-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q K Miao
GTID: 1118360305466710
Subject: Computer software and theory
Abstract/Summary:
Over the past few decades, parallel computing has developed continuously, driven by the demands of large-scale computational problems. At the same time, the peak performance of parallel computers has increased steadily. Large-scale clusters based on chip multiprocessors (CMPs) have become the mainstream parallel architecture, and parallel computing has entered the petaflops era. However, the level of application of parallel computers remains low: the performance of real applications is much lower than the peak performance the hardware provides. Consequently, fully utilizing the computational power of parallel computers and accelerating real applications have become critical issues in parallel computing.
In the future, shared memory systems will be the basic unit for building large-scale parallel computers. This dissertation focuses on improving the efficiency of parallel computers and bridging the gap between the real performance of applications and the peak performance of the hardware. The main contents are a parallel computation model and program performance optimization techniques for shared memory architectures. First, a layered parallel computation model is proposed to provide a theoretical foundation and analysis approaches for parallel algorithm design and parallel program execution, with particular emphasis on the parallel execution model. Second, to improve the performance of real applications on shared memory systems, program performance optimization techniques are studied in depth. The work on the computation model and on performance optimization can effectively increase application performance on parallel computers, and it is valuable in both theory and practice.
Specifically, the main contents and contributions of this dissertation are as follows.
(1) Layered parallel computation model. With the rapid development of parallel architectures, the conventional unified parallel computation model has become increasingly complex and difficult to use. This dissertation proposes a layered parallel computation model consisting of a parallel algorithm design model, a parallel programming model, and a parallel execution model. The properties of each model are presented, along with the associated research topics.
(2) Optimization of MPI communication on SMP. MPI is widely used in parallel programming and supports both distributed and shared memory systems, but the current MPI communication device on SMP performs poorly. This dissertation proposes an optimized MPI communication method for SMP in which IPC (inter-process communication) is employed for data transfer and a spin-waiting strategy is used for synchronization. The new method reduces message-passing delay, improves the performance of point-to-point and collective communication, and delivers high communication performance in real applications.
(3) Optimization of typical applications on shared memory systems. Two typical applications on different shared memory systems are studied. One is the parallel optimization of Mfold, a bioinformatics application, on SMP using MPI. The other is the parallel optimization of content-based image retrieval (CBIR), an information retrieval application, on CMP using OpenMP. Based on the characteristics of the applications and the architectures, high-performance parallel algorithms are designed to exploit the multi-level parallelism of the shared memory systems. The proposed optimization techniques significantly improve ILP (instruction-level parallelism), DLP (data-level parallelism), and TLP (thread-level parallelism), accelerate the two applications on shared memory systems, and provide insights into designing high-performance applications on these platforms.
(4) Quantitative program execution performance model for CMP. Building on the research into performance optimization on shared memory systems, a quantitative program execution performance evaluation model for CMP, CRAM(h), is proposed. The CRAM(h) model extracts the key aspects affecting parallel program performance, such as instruction execution, memory hierarchy, and parallelism; uses a performance profiler to measure performance parameters; quantifies the performance benefits of optimizations; and evaluates program performance. Experiments are conducted to evaluate the correctness and effectiveness of the model.
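As a rough illustration of the idea in item (2), the sketch below passes one message between two processes through shared memory and synchronizes with a spin-wait on a flag. It is not the dissertation's MPI device: the mailbox layout, the names mailbox_t, mailbox_create, mailbox_send, mailbox_recv and mailbox_demo, the 256-byte slot size, and the choice of POSIX mmap plus fork as the IPC mechanism are all assumptions made for this example.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SLOT_BYTES 256           /* hypothetical payload capacity */

/* The flag is shared between processes, so it must be lock-free. */
static_assert(ATOMIC_INT_LOCK_FREE == 2, "atomic_int must be lock-free");

/* Hypothetical single-slot mailbox placed in memory shared by two processes. */
typedef struct {
    atomic_int full;             /* 0 = slot empty, 1 = slot holds a message */
    size_t     len;              /* number of valid bytes in data */
    char       data[SLOT_BYTES];
} mailbox_t;

/* Map one mailbox that stays shared across fork(). Returns NULL on failure. */
mailbox_t *mailbox_create(void)
{
    void *p = mmap(NULL, sizeof(mailbox_t), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    mailbox_t *mb = p;
    atomic_init(&mb->full, 0);
    mb->len = 0;
    return mb;
}

/* Sender: spin until the slot is empty, copy the payload, then publish it. */
int mailbox_send(mailbox_t *mb, const void *buf, size_t len)
{
    if (len > SLOT_BYTES)
        return -1;
    while (atomic_load_explicit(&mb->full, memory_order_acquire) != 0)
        ;                        /* busy-wait instead of blocking in the kernel */
    memcpy(mb->data, buf, len);
    mb->len = len;
    atomic_store_explicit(&mb->full, 1, memory_order_release);
    return 0;
}

/* Receiver: spin until a message is published, copy out at most cap bytes,
   then mark the slot empty. Returns the number of bytes copied. */
size_t mailbox_recv(mailbox_t *mb, void *buf, size_t cap)
{
    while (atomic_load_explicit(&mb->full, memory_order_acquire) != 1)
        ;                        /* busy-wait */
    size_t n = mb->len < cap ? mb->len : cap;
    memcpy(buf, mb->data, n);
    atomic_store_explicit(&mb->full, 0, memory_order_release);
    return n;
}

/* Demo: a forked child sends one message and the parent receives it.
   Returns 0 when the parent got exactly the bytes the child sent. */
int mailbox_demo(void)
{
    static const char msg[] = "hello from child";
    char out[SLOT_BYTES];
    mailbox_t *mb = mailbox_create();
    if (mb == NULL)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        munmap(mb, sizeof *mb);
        return -1;
    }
    if (pid == 0) {              /* child process: the sender */
        mailbox_send(mb, msg, sizeof msg);
        _exit(0);
    }
    size_t n = mailbox_recv(mb, out, sizeof out);   /* parent: the receiver */
    waitpid(pid, NULL, 0);
    int ok = (n == sizeof msg) && memcmp(out, msg, n) == 0;
    munmap(mb, sizeof *mb);
    return ok ? 0 : 1;
}
```

Calling mailbox_demo() from a main function exercises one round trip. Spinning avoids a system call on the wait path, which is the usual motivation for this strategy when sender and receiver run on different cores of the same node; it wastes cycles if the two processes have to share a single core.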
Keywords/Search Tags: Parallel Computing, Layered Parallel Computation Model, SMP, CMP, Performance Optimization, Performance Evaluation, Program Execution Model