
Hardware Transactional Memory Based On The Decoupling Design Philosophy

Posted on: 2010-02-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S G Wang
GTID: 1118360278956557
Subject: Computer Science and Technology
Abstract/Summary:
We now face new challenges in sustaining performance improvements for parallel workloads as the underlying hardware platforms have entered the multi-core era. Processor design has shifted from increasing clock frequency to adding more on-chip cores, which exposes a large amount of hardware parallelism to software. Yet the software techniques that can use these hardware resources efficiently and easily are still far from satisfactory.

Transactional Memory (TM) is a new synchronization model for shared resources that is easy to program, deadlock-free and more scalable. It is a promising technique for replacing the notoriously difficult locks that are widely used in current parallel programs.

In recent years, there has been a lot of research on hardware support for transactional memory systems. Compared with the software approach, the hardware approach offers better performance, but some challenges are still not well resolved by current proposals. Two notable ones are hardware support for unbounded transactions and the handling of operating system (OS) events. This work mainly targets the hardware approach and presents a new HTM system, named DTM (Decoupled Hardware Transactional Memory), that efficiently resolves these two challenges.

An HTM system mainly involves two tasks: version management and conflict detection. Almost all current HTM proposals rely heavily on the cache system to achieve one or both of these tasks. This dissertation observes that neither a version management scheme nor a conflict detection scheme that relies on the traditional cache system can be easily extended to efficiently support unbounded transactions and the operating system.
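The two tasks named above, version management and conflict detection, can be illustrated with a small software model. The sketch below is not DTM and is not taken from the dissertation. It is a toy, single-threaded simulation that assumes lazy version management (new values stay in a private buffer until commit) and eager conflict detection over read and write sets. All names (`ToyTM`, `Txn`, `ConflictError`) are hypothetical.

```python
class ConflictError(Exception):
    """Raised when an access conflicts with another active transaction."""


class Txn:
    def __init__(self, tm, name):
        self.tm = tm
        self.name = name
        self.read_set = set()     # addresses this transaction has read
        self.write_buffer = {}    # version management: new values kept private

    def read(self, addr):
        self.tm.check_conflict(self, addr, is_write=False)
        self.read_set.add(addr)
        if addr in self.write_buffer:
            return self.write_buffer[addr]   # see our own uncommitted write
        return self.tm.memory.get(addr, 0)

    def write(self, addr, value):
        self.tm.check_conflict(self, addr, is_write=True)
        self.write_buffer[addr] = value

    def commit(self):
        # Publish buffered values to shared memory in one step.
        self.tm.memory.update(self.write_buffer)
        self.tm.active.remove(self)

    def abort(self):
        # Discard private state; shared memory was never touched.
        self.write_buffer.clear()
        self.read_set.clear()
        self.tm.active.remove(self)


class ToyTM:
    def __init__(self):
        self.memory = {}   # committed state
        self.active = []   # transactions currently running

    def begin(self, name):
        txn = Txn(self, name)
        self.active.append(txn)
        return txn

    def check_conflict(self, txn, addr, is_write):
        # Conflict detection: an access conflicts if another active
        # transaction wrote the address, or if we write what it has read.
        for other in self.active:
            if other is txn:
                continue
            if addr in other.write_buffer or (is_write and addr in other.read_set):
                raise ConflictError(
                    f"{txn.name} conflicts with {other.name} on {addr!r}")
```

In this model, if `T1` writes `"x"` and `T2` then reads `"x"`, `check_conflict` raises `ConflictError`; `T2` can abort without any effect on `memory`, and `T1` can still commit. A hardware design makes the same two decisions (where speculative versions live and when conflicts are checked), which is why the choice of the structure that holds them matters for unbounded transactions.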
Current proposals bring either a very complex design or a large execution overhead.

Based on these observations, this dissertation advocates the following thesis: the current HTM design approach, which relies heavily on the traditional cache system, may have an inherent mismatch with transactional memory, which leads to intractable design complexity or a large performance penalty when supporting unbounded transactions and the operating system. The field should try an approach that fully decouples transaction processing from the traditional cache system to resolve these two challenges.

DTM offers the following benefits: (1) full hardware-speed execution of unbounded transactions without sacrificing concurrency, with almost no false conflicts and low resource overhead; (2) smooth hardware support for the operating system, including unrestricted context switches, migration and virtual memory paging, where the OS handles these events with few operations beyond those required for a non-transactional thread; (3) a clear separation of module functions that fully decouples the transaction processing module from the traditional cache system, so the system can be easily implemented, with the transaction module seamlessly incorporated into, but orthogonal to, the other parts of the system; (4) efficient support for advanced TM semantics, such as strong isolation, nested transactions, irreversible mode and partial abort.

The second part of this work proposes an efficient, unbounded hybrid-mode TM system with a strong isolation guarantee, called HybridTCache. HybridTCache optimizes the common case by executing small transactions completely in hardware, and triggers operating system (OS) support with low overhead in the uncommon case where the transaction size exceeds the hardware capacity.

HybridTCache adds a new L1 cache, named TCache, to buffer transactional data for the active transaction executed by the processor.
Compared with the traditional log-based approach, TCache provides fast bookkeeping that eliminates software logging overhead for blocks that have not overflowed, making both transaction commit and abort fast. A key design point of hardware TM is support for unbounded transactions. HybridTCache achieves this by introducing TCache overflow exceptions and relying on the OS to handle the overflowed blocks.

HybridTCache is evaluated using the GEMS simulator with 16 SPARC processors. Running the SPLASH-2 benchmark suite shows that transaction processing performance increases significantly compared with a traditional log-based transactional memory system.

Current TM systems usually execute transactions in an abort-and-retry fashion. This dissertation studies the common characteristics of this execution model and finds that some of the work performed by an aborted transaction can be efficiently reused to speed up re-execution.

Based on this idea, this dissertation proposes a new, general method called transaction iteration's data reusing (TItDR), which reuses the objects opened by a failed transaction in the subsequent re-execution. The obvious advantage is that opening an object is greatly simplified if it was already opened in the previous failed attempt, and most of the cleanup work is no longer needed. TItDR leaves opened objects in a pseudo-active state and restarts the transaction. The dissertation discusses conflict resolution, validation and commit/abort processing in combination with the data reusing method, and shows that TItDR does not incur more conflicts or more overhead for validation or commit. Both currently proposed software transactional memory (STM) systems and hardware transactional memory (HTM) systems have considerable potential for data reuse. The test results are based on an STM implementation and show a good performance improvement on average.
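The abstract does not describe TItDR's mechanics, so the following is only a toy model of the general idea: in an abort-and-retry loop, the entries created when objects were opened in a failed attempt are kept and reused on the next attempt instead of being torn down and rebuilt. The class `RetryTxn`, the `Abort` exception, the counters and the `transfer` body are all hypothetical names introduced for illustration. The model is single-threaded and ignores the conflict resolution and validation issues the dissertation addresses.

```python
class Abort(Exception):
    """Signals that the current attempt must be retried."""


class RetryTxn:
    def __init__(self, store):
        self.store = store
        self.opened = {}         # object id -> private working entry
        self.active_ids = set()  # ids opened in the current attempt
        self.full_opens = 0      # opens that had to build a new entry
        self.reused_opens = 0    # opens served by a leftover entry

    def open(self, obj_id):
        self.active_ids.add(obj_id)
        entry = self.opened.get(obj_id)
        if entry is not None:
            # Reuse path: the entry survived a failed attempt, so only
            # its value is refreshed from the committed store.
            self.reused_opens += 1
            entry["value"] = self.store[obj_id]
            return entry
        # Full path: build a new working entry (the costly step here).
        self.full_opens += 1
        entry = {"value": self.store[obj_id]}
        self.opened[obj_id] = entry
        return entry

    def run(self, body, max_attempts=5):
        for attempt in range(1, max_attempts + 1):
            self.active_ids = set()
            try:
                body(self, attempt)
            except Abort:
                continue  # keep self.opened for the next attempt
            # Commit only the objects opened in the successful attempt.
            for obj_id in self.active_ids:
                self.store[obj_id] = self.opened[obj_id]["value"]
            self.opened.clear()
            return attempt
        raise RuntimeError("transaction did not commit")


def transfer(txn, attempt):
    a = txn.open("a")
    b = txn.open("b")
    if attempt == 1:
        raise Abort()  # simulated conflict on the first attempt
    a["value"] -= 10
    b["value"] += 10
```

Running `RetryTxn({"a": 100, "b": 0}).run(transfer)` commits on the second attempt, and both opens in that attempt take the reuse path. In a real object-based STM the saving would come from skipping per-object setup and cleanup, which this model only counts.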
Keywords/Search Tags: Hardware transactional memory system, decoupled design philosophy, DTM, HybridTCache, data reusing, TItDR