Font Size: a A A

The Characteristic Analysis And Runtime Optimization Of Lock Synchronizations In The Multithreaded Programs

Posted on:2017-01-11Degree:DoctorType:Dissertation
Country:ChinaCandidate:L ZhengFull Text:PDF
GTID:1318330485450831Subject:Computer architecture
Abstract/Summary:PDF Full Text Request
With the advent of multicore/manycore processors, the power of computer has been greatly enhanced. To fully exploit their computing power, the multi-threaded programming languages are born at the right moment, but also introduce the problem of inter-thread communication. As a consequence, the lock synchronization mechanism is proposed. In the field of parallel programming languages, the core idea of lock synchronization mechanism is to guarantee the exclusive access of the multiple threads to the same shared resource. Due to the randomness and complexity of runtime scheduling of program, in the dynamic program execution, lock synchronization mechanism introduces a large number of non-conflicting exclusive accesses as well, i.e., unnecessary lock synchronizations in which no same shared resource is accessed by the critical sections under the protection of the same lock. The unnecessary lock synchronization formed in the dynamic program execution leads to many negative effects:1) in the aspect of performance impact, the critical sections of the unnecessary lock synchronizations originally could be performed in parallel because no shared same shared resource is accessed. However, in practice they are serialized due to the lock protection. As a result, the unnecessary lock synchronizations with the huge amount, in general, degrade the performance of program severely.2) In the aspect of race detection, existing mainstream race detection techniques stem from Happens-Before (HB) relation, which usually forms the partial ordering between unlock and lock events across threads. If any two events have no partial ordering between them, they constitute a race conditon. However, on account of the parallelization characteristics of unnecessary lock synchronization, the strong edge ordering of HB relation in general leads to the false negatives. Besides, the unnecessary lock synchronizations have the exponential possibility of interleavings in amount. As a consequence, it is not an easy job to efficiently identify unnecessary lock synchronizations. Around the probems above, we carry out the project regarding "The characteristic analysis and runtime optimization of lock synchronizations in the multi-threaded programs", which consists of the following three aspects:For the characteristic study, we first sysmatically analyze the runtime characteristic of lock synchronizations, especially for the unnecessary lock synchronizations. To be specific, taking some real-world programs (e.g., OpenLDAP, mysql, pbzip2, transmissionBT, handbrake) as benchmarks, we collect the lock synchronizations among them. Based on the observations, we have made the summarization regarding lock synchronizations, including their classification, manifestations, root causes, systematic impact, prevention strategies as well as potential fixing strategies. Furthermore, our characteristic study reveals 11 new observations and implications that have the great signification for the in-depth comprehension on the unnecessary lock synchronizations.For the performance debugging, we propose a replay-based performance debugging approach-PerfPlay. The main idea of PerfPlay is as follows:First, we record the original ULCP trace which is then transformed into a ULCP-free one by the means of topological graph techniques. Afterwards, through replaying two traces, we obtain the corresponding performance results. Finally, the performance impact of unnecessary lock synchronizations can be quantified through the comparison of two replayed performance results. The experimental results demonstrate that:1) PerfPlay guarantees the performance fidelity of replay execution with the high performance stability and performance precision; 2) With the low (<4.3%) runtime lockset overhead, PerfPlay recommends the unnecessary lock synchronization related code sites to the programmers with the high optimizable value; 3) the case studies further reveal the efficacy of PerfPlay as well.For the race detection, we propose a relaxed Happens Before relation, also called ULCP-HB relation which can break through the partial ordering between unlock and lock of unnecessary lock synchronizations enfored by the strong edge constraint of HB relation. To implement the ULCP-HB relation, combining with the runtime characteristics of unnecessary lock synchronizations, we further propose a lightweight two-step detection approach with the online heuristic analysis and offline reordering analysis. Our approach can exploit the data races hidden behind unnecessary lock synchronizations without introducing negligible runtime overhead. The experimental results also show that, compared to the HB-based technique, our approach exploits 19.8% more data races.51.0% of execution time and 52.3% of trace size can also be saved through a quick filtering of the simple pattern with a negligible (<4.45%) performance overhead introduced on-the-fly.Overall, around unnecessary lock synchronizations, our project makes an in-depth, comprehensive study from two aspects:one basic research (i.e. runtime characteristic study) and two extending studies (i.e. performance debugging and data race detection). These researches strengthen the understanding on the program behaviors and systematic impact of unnecessary lock synchronizations and further assist the programmers to fix the program impact of unnecesarry lock synchronziations effectively.
Keywords/Search Tags:Multithreaded Program, Characteristic Study, Performance Debugging, Data Race, Unnecessary Lock Synchronizations
PDF Full Text Request
Related items