
Design And Performance Evaluation Of Scalable Transactional Memory Architecture Supporting Speculative Parallelization

Posted on: 2010-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: R Guo
Full Text: PDF
GTID: 2178360302959877
Subject: Computer system architecture
Abstract/Summary:
As a mainstream design choice, the Chip Multi-Processor (CMP) relies on multi-threaded applications to make full use of its massive computing resources. However, traditional parallel programming methodology encounters difficulty with both productivity and correctness, which seriously restricts the utilization of CMP computation power. Transactional Memory (TM) and Thread-Level Speculation (TLS) techniques have been proposed to address these problems, targeting the synchronization problem in explicit parallelization and the parallelization of sequential applications, respectively. Both techniques gain performance from aggressive parallel execution and leave the correctness issue to runtime conflict detection in hardware, which greatly simplifies programming.

Existing research focuses on only one of the two techniques at a time, and tends to gain limited performance at the price of complicated hardware design, especially existing TLS proposals, which tend to employ closely coupled designs and complicated buffering mechanisms. This dissertation abstracts the semantics of both techniques and designs a scalable and realistic solution that provides efficient hardware support for both speculative and manual parallelization. The rich semantics provided by this solution will greatly simplify parallelization.

This dissertation focuses on providing unified hardware support for both TM and TLS. The detailed work includes the following aspects. First, a scalable abstract hardware model called LogSPoTM is proposed to provide uniform support for both TM and TLS. A simulator, which provides a realistic implementation of the model, and a corresponding supporting library are also provided to form a complete evaluation system. The objective of simplifying parallelization is achieved, since parallelizing a program on this system requires only minor changes to the code. Second, a set of representative benchmarks with different memory access patterns is chosen to evaluate the factors and design choices that might affect the performance of a LogSPoTM implementation. The evaluation results are carefully analyzed to identify the key factors. Finally, to address the obstacle imposed by slow simulation speed, an FPGA-based LogSPoTM simulation environment is designed on the HAsim hardware simulation infrastructure. The simulation speed can be improved by two to three orders of magnitude. Unlike a normal hardware prototype, this will be a highly configurable and observable research simulator.

We have drawn some conclusions about the speculative parallel threading technique itself during the implementation and evaluation of the LogSPoTM architecture. First, more effort should be devoted to software optimization instead of complicated hardware design. Recognizing application characteristics such as dependency and memory access patterns and iteration granularity, with compiler support, should help considerably. Second, it is impractical to improve the performance of all applications through automatic parallelization using speculation; however, this speculative multi-threading technique can be regarded as an assistant tool for sophisticated manual parallelization.
Keywords/Search Tags: scalable CMP architecture, transactional memory, speculative parallelization, FPGA-based hardware simulation