Research On Deterministic Replay Recording Techniques For Multicore Systems

Posted on: 2019-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Ji
Full Text: PDF
GTID: 2428330599477707
Subject: Computer technology
Abstract/Summary:
Historically, rapid gains in processor performance came from rapid increases in the clock frequency of a single processor core. For various reasons, frequency scaling has now hit a bottleneck, so on-chip multi-core processors have emerged and become ubiquitous. Many problems must still be solved, however, before their computational efficiency can be fully exploited. In a multi-core environment, the nondeterminism of parallel program execution is the main obstacle: the same parallel program, given the same input and started from the same restore point, may still produce different results. Races between threads on shared memory are the most important source of this nondeterminism, and they make parallel programs very difficult to debug. Deterministic replay is an effective, widely studied approach to this problem. It consists of two phases, a record phase and a replay phase, and the resources consumed by the memory race log in the record phase determine the performance of a deterministic replay method. This thesis therefore starts from the space consumed by the race log and studies low-overhead memory race recording.

Four methods are proposed that reduce the size of the memory race log from two aspects, and together they form a segmented, cross-recorded memory race recording method. First, the current instruction count is used to record memory race dependences: when a race occurs, the thread's current instruction count is recorded in place of the instruction counts of both parties to the race. Second, transitive reduction is applied to race records on a segmented basis: redundant races that can be derived from already-recorded races are identified by comparing instruction segments and are not logged, which greatly reduces the number of race records. Third, adjacent records with the same conflict direction are cross-recorded, so that each record in the log has a conflict direction different from that of its adjacent records, further reducing the number of race records. Finally, races are recorded using segmented instruction counting: the instruction count is divided into segments, which lowers the maximum count value and reduces hardware consumption. By reducing both the number of race records and the space occupied by a single record, the thesis shrinks the memory race log and reduces hardware resource consumption.

To evaluate the proposed method, the corresponding hardware architecture was built on the gem5 multi-core processor simulation platform, and the classic multi-core benchmark suite SPLASH2 was used for verification and analysis. The traditional memory race recording methods RTR and FDR were run in the same environment from the same restore point. The comparison shows that the proposed method effectively reduces the size of the memory race log while adding only a small amount of hardware.
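The idea of dropping race records that are already implied by earlier ones can be illustrated with a small sketch. This is not the thesis's segmented method, whose details the abstract does not give; it is a simplified, per-thread-pair form of transitive reduction. The class `RaceLogger`, its `record` method, and the tuple layout are hypothetical names chosen for illustration, and the sketch assumes dependences arrive in execution order.

```python
from collections import defaultdict


class RaceLogger:
    """Logs cross-thread memory dependences, skipping implied ones.

    Hypothetical illustration only: a dependence (src, src_ic) -> (dst, dst_ic)
    means instruction number src_ic of thread src must run before instruction
    number dst_ic of thread dst during replay.
    """

    def __init__(self):
        # last_seen[dst][src] is the largest instruction count of thread
        # `src` that a logged dependence already orders before `dst`.
        self.last_seen = defaultdict(dict)
        self.log = []

    def record(self, src, src_ic, dst, dst_ic):
        """Return True if the dependence was logged, False if redundant."""
        known = self.last_seen[dst].get(src, -1)
        if src_ic <= known:
            # An earlier record already forces a later (or equal) instruction
            # of `src` before an earlier instruction of `dst`; program order
            # in both threads then implies this dependence.
            return False
        self.last_seen[dst][src] = src_ic
        self.log.append((src, src_ic, dst, dst_ic))
        return True


logger = RaceLogger()
logger.record(0, 10, 1, 5)   # logged: first dependence from thread 0 to 1
logger.record(0, 7, 1, 9)    # skipped: implied by the record above
logger.record(0, 12, 1, 11)  # logged: a newer instruction of thread 0
print(len(logger.log))
```

The second call is skipped because instruction 7 of thread 0 precedes instruction 10 in program order, and instruction 5 of thread 1 precedes instruction 9, so the first record already enforces the ordering. A fuller reduction would also follow chains through third threads, which this sketch deliberately omits.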
Keywords/Search Tags: on-chip multi-core, parallel program, deterministic record, gem5