Limited by the power budget, the freuency of superscalar processors stops to in-crease, which in turn limits serial execution performance. As a result, the explorationof thread-level parallelism has become increasingly important. Currently, most modernprocessors employ muti-core and muti-thread architecture, resulting in the increasinglyhigh chip level parallelism. However, parallel programming model and concurrency con-trol model has not paced with the development of microprocessor architectures, parallelprogramming is still a great challenge for programmers.Transactional memory brings new chances for the development of parallel program-ming model and concurrency control model. It borrows the idea of transaction fromdatabase and use transactions to handle concurrent accesses to shared resources. Theunderlying transactional memory system ensures the atomicy and isolation of the trans-actions. The parallel programming model based on transactional memory has a lot ofadvantages over the lock-based approach. It is lock free, composable and easy to use.Recently, transactional memory has drawn a wide range of attentions of the researchersand becomes a hotspot in computer science.This paper focus on the performance of transactional memory systems and exploresthe important factors which affects the parallelism of transactions.Firstly, this paper categorizes the parallelism of transactions into three level: theinter-threadtransactionalparallelism, thein-threadtransactionalparallelism, andthenest-ed transactional parallelism. For each level of parallelization, this paper formalizes theexecution of transactions and search for the condition for it. After formalized analysis,this paper concludes that the condition is the conflict graph contains no circle.Secondly, this paper proposes the conflict graph based transactional memory systemdesign. It supports both inter-thread and in-thread transactional parallelism, making it alevel 2 transactional memory system. It keeps track of the conflict graph of the activetransactions dynamically. And when a transaction tries to commit, it validate the transac-tionbydetectioncircleintheconflictgraph. Comparedwithpreviouslyproposedtransac-tional memory designs, the conflict graph based transactional memory system maximizethe parallelism of the transactions, which further fully utilizing the parallel processingpower of multi-core processors. Thirdly, this paper choose three typical real world applications, the real-time facerecognition,thepush-relabelmethodforthemaxflowproblem,andtheAdaboostlearningfor face detection, and parallelize them taking the advantage of transactional memory. Ontheonehand,wetrytofindouttheproperwaytoparallelizerealworldapplicationthroughtransactionalmemory,ontheotherhand,weevaluatetheimplementationsoftransactionalmemory systems through practical workloads.Lastly, this paper proposes a analytical performance model for hardware transac-tional memory, which is based on discrete Markov train. The execution process of atransaction is represented by an absorbing Markov train and fixed-point iteration is usedto solve the model. Then the main performance measures, such as the mean number oftransactionrestartsandtheaveragelengthoftransactionscanbecomputed. Theanalyticalmodel is validated against a hardware transactional memory simulator, which shows thatit can predict the performance trends precisely. Moreover, three types of hardware trans-actional memory implementations are compared through the analytical model, includingthe one with eager conflict management, the one with lazy conflict management and theone with mixed conflict management. The results show that hardware transactional mem-ory systems with mixed conflict management present better performance and scalability.Moreover, our analysis show that the processor cores in a transactional memory systemis not likely to scale well beyond 230.
|