Font Size: a A A

Research On Influence Of Threads Placement Strategy On Performance And Long-term Fairness Of Hierarchical Locks

Posted on:2020-11-20Degree:MasterType:Thesis
Country:ChinaCandidate:P F ZhaoFull Text:PDF
GTID:2428330620958863Subject:Software engineering
Abstract/Summary:PDF Full Text Request
The emergence of NUMA(Non-Uniform Memory Access)architecture overcomes the scalability limits of SMP(Symmetric Multi-Processor)architecture,enabling more cores integrated on a single machine.While the distributed design of physical memory in NUMA architecture makes memory access suffer from non-uniform latency.More cores enable multithreading applications to use more threads to take advantages of multicore resource and non-uniform memory access latency makes exploiting locality more important to achieve higher performance.Both of these changes in multithreading applications pose new challenges to the design of locks,which tend to become the bottleneck of system performance.Queue-based locks,especially MCS lock1 and CLH lock2,are widely used in lock-intensive high-performance systems due to their good performance on throughput,scalability and fairness.These locks push threads contending for the same lock into a first-in first-out(FIFO)queue by their requesting order.Each thread spins on its own record in the queue and enters critical section by the order in the queue.The FIFO queue guarantees the fairness of queue-based locks,and each thread spinning on a different memory location provides good performance and scalability.On NUMA architecture,in order to make full use of multi-core resources,threads competing for the same lock are inevitably distributed on different NUMA nodes,resulting in inter-node lock transfer,which typically takes much longer time than lock transfer in the same node.The traditional queue-based locks are unaware of the underlying NUMA factor and they need to maintain the FIFO lock acquisition order.Which involves enormous inter-node lock transfers and thus causes performance degradation.To avoid performance degradation of queue-based locks on NUMA architecture,NUMAaware queue-based hierarchical locks(hereinafter referred to as hierarchical locks),such as Cohort locks and HMCS(Hierarchical MCS)locks,change the global FIFO lock transfer order and prefer to transfer the lock to a local waiter in the same node,only if these is no local waiter or the lock acquisitions on local node have exceeded the predefined threshold is the lock transferred to another node.Hierarchical locks reduce the frequency of inter-node lock transfer and thus achieves higher performance under NUMA architecture.Since the high-performance of hierarchical locks depends on the distribution of threads on NUMA nodes,the threads placement strategy has a great impact on the performance improvement that could ultimately be achieved.In existing hierarchical locks,threads are usually placed compactly(Compact Placement)on as few nodes as possible to reduce the frequency of inter-node lock transfers.This strategy helps hierarchical locks exploit locality to achieve higher performance,but the local-preferred lock transfer mechanism of hierarchical locks makes it hard to guarantee long-term fairness.Besides,placing threads evenly on all nodes(Even Placement)theoretically guarantees the longterm fairness of hierarchical locks,but since the thread distribution is more sparse,when the lock contention is not high enough,it could lead to serious performance degradation compared to Compact placement.The existing threads placement strategies in hierarchical locks could not guarantee both high throughput and long-term fairness or could only guarantee both of them when highly contended.While in some scenarios such as public cloud both throughput and fairness are key to success and the lock contention could vary from time to time.In this paper,we present a contention-aware hybrid threads placement framework CAH.In CAH we propose two new thread placement strategies: Compact With Shift(CWS)and Enhanced Even(EE).CWS introduces a lightweight thread shifting mechanism into compact placement to smooth the throughput difference among threads.EE limits the number of nodes even placement could exploit to the minimum number of nodes that could place all threads,instead of all nodes.CAH is essentially a hybrid solution of CWS and EE that dynamically adjusts the thread placement strategy based on the lock contention in the application.When the contention level of the lock is high enough,CAH applies EE strategy,otherwise CWS strategy,so that regardless of the change of lock contention,the throughput and long-term fairness could always be guaranteed.
Keywords/Search Tags:hierarchical locks, threads placement, contention-aware
PDF Full Text Request
Related items