
Study On Transactional Memory Programming And Software Library Optimization Of Tera-Flops Computer KD-50-I

Posted on: 2009-07-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Q Yang
GTID: 1118360242995764
Subject: Computer software and theory

Abstract/Summary:
With the development of high performance computers, and especially the advent of the CMP (Chip Multiprocessor) architecture, parallel computing is receiving more and more attention. However, its wider adoption is limited by the complexity of writing parallel programs and by the cost of high performance computers. Our research therefore focuses on two topics: Transactional Memory (TM) programming on CMP, to reduce the complexity of parallel programming, and software optimization for the Tera-Flops high performance computer KD-50-I, to promote high performance machines made in China. The main contributions and innovations of this dissertation can be summarized as follows:

1. Study on the Transactional Memory Execution Parallel Programming Model

The TM programming model for future CMP architectures is discussed, and a software library framework for writing TM programs is implemented. By providing a set of programming APIs, such as begin_transaction, end_transaction and abort_transaction, the library makes the source code transaction-aware in an intuitive and detailed way. It can therefore be used to verify new transactional memory execution algorithms and to inform the design of new transactional memory hardware.

2. Study on the Extension of OpenMP to Support Transactional Memory Execution

Although OpenMP is a popular multithreaded programming model on CMP architectures, OpenMP compilers do not check for data dependencies, memory access conflicts or other likely causes of program errors. Programmers traditionally use locks to guarantee correctness. Coarse-grained locking is easy to write, but it may sacrifice the parallelism of the program. Fine-grained locking can expose more of the potential parallelism, but it may introduce problems such as priority inversion and deadlock. Extending OpenMP to support Transactional Memory resolves this difficult trade-off between the simplicity and the productivity of writing parallel OpenMP programs.

3. Study on Speeding Up Sequential Binary Programs with Transactional Memory Execution on the CMP Platform

Legacy binary code without source code cannot make full use of a CMP platform, because most such programs are single-threaded and sequential. To improve its performance on CMP, the binary program is decompiled and then multithreaded. Two main factors make this idea difficult to realize. First, it is difficult to recover data types and complex control flow from binary code, so the resulting pseudocode is hard to read. Second, the most difficult aspect of parallel multithreaded programming is data dependency analysis, and it is even harder to perform on obscure pseudocode. Combining decompilation with speculative parallel threading based on TM addresses both the incomplete information in the pseudocode and the strict limitations of parallelizing compilers, so more potential parallelism can be found.

4. Study on Software Library Optimization of the Tera-Flops Computer KD-50-I

KD-50-I has the advantages of high performance, low cost, low power consumption and a small footprint. Because the Loongson 2F processor supports multiply-add instructions and a four-issue pipeline, loop unrolling and instruction scheduling are applied to improve instruction-level parallelism. In addition, data prefetching is applied to reduce memory access cost. Given the fixed topology of the nodes and the simple interconnection, the LBP communication model is applied to analyze and optimize the communication library. Through this software library optimization, the performance of applications is improved, which is significant for the popularization of the KD-50-I high performance computer made in China.

5. Study on Parallel Data Mining on the Tera-Flops Computer KD-50-I

Data mining on sequential computer systems cannot keep up with the demands of large-scale data processing and complex computation. The development of network and high performance computing technology makes it possible to solve this problem with parallel data mining. Against the background of financial risk management, the parallel data mining algorithm is optimized according to the specific characteristics of the computing nodes and the network topology of KD-50-I, and its performance is improved. This work is helpful for the design of parallel algorithms and of other applications on KD-50-I.
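To make the transactional API named in contribution 1 more concrete, the following is a minimal illustrative sketch of a software TM library in Python. It is not the dissertation's library, whose implementation language, signatures and conflict-detection algorithm are not given in this abstract. Only the names begin_transaction, end_transaction and abort_transaction come from the text; TVar, tm_read, tm_write, atomic, run_demo and the version-validation scheme are assumptions made for illustration.

```python
import threading


class TVar:
    """Hypothetical shared cell; `version` is bumped on every committed write."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class TransactionAborted(Exception):
    """Signals that the current transaction was rolled back."""


_commit_lock = threading.Lock()  # guards commits and snapshot reads only
_tx = threading.local()          # per-thread read set and write buffer


def begin_transaction():
    _tx.reads = {}   # TVar -> version seen at first read
    _tx.writes = {}  # TVar -> buffered value, invisible to other threads


def tm_read(var):
    if var in _tx.writes:  # a transaction sees its own buffered writes
        return _tx.writes[var]
    with _commit_lock:     # take a consistent (value, version) pair
        value, version = var.value, var.version
    _tx.reads.setdefault(var, version)
    return value


def tm_write(var, value):
    _tx.writes[var] = value  # buffered until end_transaction commits


def abort_transaction():
    _tx.reads, _tx.writes = {}, {}  # discard all buffered effects
    raise TransactionAborted()


def end_transaction():
    # Commit only if nothing this transaction read has changed since.
    with _commit_lock:
        valid = all(var.version == seen for var, seen in _tx.reads.items())
        if valid:
            for var, value in _tx.writes.items():
                var.value = value
                var.version += 1
    if not valid:
        abort_transaction()


def atomic(body):
    """Assumed convenience wrapper: rerun `body` until it commits."""
    while True:
        begin_transaction()
        try:
            result = body()
            end_transaction()
            return result
        except TransactionAborted:
            continue


def run_demo(n_threads=4, n_increments=500):
    """Several threads increment one counter without any explicit lock."""
    counter = TVar(0)

    def worker():
        for _ in range(n_increments):
            atomic(lambda: tm_write(counter, tm_read(counter) + 1))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

In this sketch the programmer marks only the boundaries of the critical region, and conflicting transactions are detected at commit time and retried, which is the trade-off against hand-written coarse-grained or fine-grained locks discussed in contribution 2. Note that atomic also retries after an explicit abort_transaction, so a body that always aborts would loop forever; a real library would likely distinguish the two cases.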
Keywords/Search Tags:Transactional Memory, Parallel Programming Model, Chip Multiprocessor, Tera-Flops Computer