
Research On Parallel Model And Compiler Optimization Technique Based On Multi-core

Posted on: 2012-12-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H F Guo
GTID: 1118330371462496
Subject: Computer software and theory
Abstract/Summary:
The rapid development of multi-core architectures brings opportunities as well as considerable challenges. Familiar sequential applications gain no performance on multi-core systems; only parallel or parallelized applications can make full use of the abundant computing resources that multi-core provides. The parallelization tools in compilers can handle some programs. For irregular parallelism, programmers must use suitable parallel models and languages. Most current explicit parallel models handle coarse-grained parallelism well but offer little support for fine-grained parallelism and synchronization. In addition, few current parallel programming tools implement classical data-flow analysis and optimization. Analysis and optimization of parallel programs are themselves a challenge: the nondeterministic runtime state of parallel programs makes analysis less accurate and optimization difficult. Exploiting fine-grained parallelism and analyzing and optimizing parallel programs have therefore become a research focus in multi-core compiler technology.

OpenMP has become a de facto standard API for shared-memory architectures. As multi-core spreads, more people use it to program multi-core systems. This dissertation studies data-flow analysis and optimization techniques for OpenMP programs, as well as implementations of fine-grained parallelism on multi-core architectures. The main contributions are as follows:

1. This dissertation presents a method for constructing a parallel control-flow graph for OpenMP programs, called OMPCFG, which incorporates the OpenMP memory consistency model. It has fewer conflict edges than previously proposed CFGs. By defining the set of variables to flush, the possibility of nondeterminism is reduced during analysis on the OMPCFG. Accordingly, reaching-definitions analysis is more accurate on the OMPCFG.

2. A parallel static single assignment (SSA) form for OpenMP programs is built on the OMPCFG. Four optimizations on the parallel SSA form are implemented: copy propagation, dead code elimination, sparse conditional constant propagation, and loop-invariant code motion. Case studies show that these optimizations work well on programs that general-purpose compilers either cannot optimize or may optimize incorrectly. In addition, these analysis and optimization techniques provide opportunities for exploiting fine-grained parallelism.

3. This dissertation presents a source-code-level implementation of a pipeline parallelism model for iterative algorithms, which are often used in engineering, and proposes a synchronization method that uses cyclic queues for the thread-level pipelined parallelism model. Experiments show that it reduces program execution time compared with the space-partition parallel model and with another synchronization model.

4. Three kinds of implementations of thread-level speculative (TLS) parallelism are discussed and studied. This dissertation presents a framework of two critical techniques for implementing TLS parallelism in OpenMP.

5. A new selective replication policy and a coherence protocol are proposed for a specific blend cache architecture; the protocol integrates the merits of both directory-based and snooping protocols. It exploits the prevalence of proximity communication to reduce the cost of cache coherence. Experiments show that it improves cache performance and program execution performance compared with a directory-based protocol.
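As a rough illustration of the cyclic-queue synchronization mentioned in contribution 3, the following is a minimal sketch of a two-stage thread pipeline connected by a fixed-size ring buffer. It is not the dissertation's implementation: the names `CyclicQueue`, `run_pipeline`, and `DONE`, the two stages (squaring, then a running sum), and the choice of Python threads instead of OpenMP are all assumptions made for the example.

```python
import threading
import time

DONE = object()  # hypothetical end-of-stream marker


class CyclicQueue:
    """Fixed-capacity ring buffer for one producer and one consumer (illustrative)."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0  # items consumed so far; written only by the consumer
        self.tail = 0  # items produced so far; written only by the producer

    def put(self, item):
        # Producer waits while every slot is occupied.
        while self.tail - self.head == self.capacity:
            time.sleep(0)  # yield so the consumer can run
        self.buf[self.tail % self.capacity] = item
        self.tail += 1  # publish only after the slot is filled

    def get(self):
        # Consumer waits while the ring is empty.
        while self.tail == self.head:
            time.sleep(0)  # yield so the producer can run
        item = self.buf[self.head % self.capacity]
        self.head += 1  # release the slot only after reading it
        return item


def run_pipeline(values, capacity=4):
    """Stage 1 squares each value; stage 2 keeps a running sum of the squares."""
    queue = CyclicQueue(capacity)
    results = []

    def stage1():
        for v in values:
            queue.put(v * v)
        queue.put(DONE)

    def stage2():
        total = 0
        while True:
            item = queue.get()
            if item is DONE:
                break
            total += item
            results.append(total)

    threads = [threading.Thread(target=stage1), threading.Thread(target=stage2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(1, 6)))
```

Each counter has a single writer, so the two stages synchronize only through the queue indices and never through a global barrier. This sketch relies on CPython's interpreter lock for memory visibility; an OpenMP or C version would presumably need explicit flushes or atomic operations on `head` and `tail`.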
Keywords/Search Tags:parallel control-flow graph, parallel data-flow analysis, parallel static single-assignment, thread-level pipelined parallelism, thread-level speculative parallelism, blend architecture cache