
Techniques To Tackle Memory Interference In Multicore Systems

Posted on: 2018-05-10  Degree: Doctor  Type: Dissertation
Country: China  Candidate: D L Xiong  Full Text: PDF
GTID: 1318330542488601  Subject: Circuits and Systems
Abstract/Summary:
In multicore systems, main memory is typically a critical shared resource. Multiple applications executing concurrently contend for the limited memory bandwidth and capacity and interfere with one another. Inter-application memory interference includes destroyed row-buffer locality, destroyed bank-level parallelism, and contention for the shared address and data buses. If this interference is not properly managed, it can result in low system performance and fairness, unpredictable application slowdowns, and even starvation of some applications. Building QoS and application awareness into the different components of the memory system, such as memory controllers, caches, and interconnects, is important for controlling interference at these components and mitigating or even eliminating unfairness and unpredictability. To this end, prior work tackles the memory interference problem from two directions: 1) mitigating interference, thereby reducing application slowdowns and improving overall system performance; and 2) precisely quantifying and controlling the impact of interference on application slowdowns, thereby providing performance guarantees to specific applications.

Application-aware memory access scheduling is a prevalent way to mitigate memory interference. With the goal of high system performance and fairness, most such schedulers monitor the applications' memory access characteristics, rank the applications in a total order based on these characteristics, and prioritize the vulnerable-to-interference applications, thereby mitigating memory interference. However, ranking applications in a total order incurs high hardware complexity, and its critical-path latency exceeds the minimum scheduling time of state-of-the-art DRAM protocols. Moreover, the applications at the bottom of the ranking stack are unfairly slowed down. A few application-aware schedulers instead dynamically separate applications into two groups using simple schemes and prioritize the applications in the vulnerable-to-interference group. These achieve low hardware complexity and cost, but suffer poor system performance and fairness.

This thesis proposes a memory access scheduler based on dynamic multi-level priority, called DMPS, which aims to balance system performance, fairness, and hardware complexity. First, DMPS uses an MPKC-based metric, "memory occupancy", to measure interference; this metric is simple to implement in hardware and effectively reflects the interference each application causes. Second, DMPS maps applications to multiple priority levels, which not only effectively separates applications with different behaviors but also keeps hardware complexity low. Third, an application's priority is lowered as more of its requests are served, ensuring that applications with similar characteristics stay at the same priority level and share the memory bandwidth fairly. The evaluation results show that DMPS retains the advantages of prior memory schedulers while avoiding their disadvantages, providing a good trade-off among system performance, fairness, and hardware complexity.

An online slowdown estimation model is a prevalent way to quantify and control inter-application interference; the key challenge is accurately estimating application slowdowns in the presence of memory and cache interference. Prior work uses two definitions of slowdown: 1) the ratio of interfered to uninterfered stall/execution time, and 2) the ratio of uninterfered to interfered request access/service rate. The former must estimate the exact number of cycles by which each request is delayed due to interference, but the service of different requests overlaps significantly because of the abundant parallelism in the memory system, resulting in high inaccuracy in the slowdown estimates. The latter assigns the application being estimated the highest priority for accessing main memory, minimizing the memory interference it receives. It quantifies the interfered cycles in an
aggregate manner over a large set of requests, which significantly improves estimation accuracy. However, the request access/service rate does not represent the performance of compute-intensive applications well, and the model used to quantify the interfered cycles is overly simple.

This thesis proposes an online slowdown estimation model, called SEM, to quantify and control memory interference. First, SEM uses IPC to represent application performance and defines slowdown as the ratio of uninterfered to interfered IPC. This not only avoids classifying applications and unifies the equation for computing application slowdowns, but also improves estimation accuracy thanks to the fine granularity of IPC. Second, SEM assigns the application being estimated the highest priority for accessing main memory, minimizing the interference it receives. SEM uses a DRAM-like structure to monitor interference and accounts for write interference, row-buffer interference, and data-bus interference. Third, SEM considers bank-level parallelism when quantifying interference, matching the real behavior of the memory system. The evaluation results show that SEM achieves better estimation accuracy whether or not the shared cache is enabled.
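The dynamic multi-level priority idea behind DMPS can be illustrated with a minimal Python sketch. All names and parameters here (`NUM_LEVELS`, `DECAY_THRESHOLD`, the bucketing rule) are illustrative assumptions, not the thesis's actual hardware design: applications are bucketed into a few priority levels by their measured memory occupancy, and an application's priority decays as its requests are served, so applications with similar behavior end up sharing bandwidth fairly.

```python
# Illustrative sketch of a dynamic multi-level priority scheduler.
# NUM_LEVELS and DECAY_THRESHOLD are assumed values, not from the thesis.
from dataclasses import dataclass

NUM_LEVELS = 4        # small number of priority levels keeps hardware simple
DECAY_THRESHOLD = 16  # requests served before priority drops one level

@dataclass
class AppState:
    occupancy: float = 0.0  # MPKC-style "memory occupancy" from the monitor
    level: int = 0          # current priority level (0 = highest)
    served: int = 0         # requests served since the last level change

def assign_levels(apps: dict) -> None:
    """Map applications to NUM_LEVELS buckets by measured memory occupancy:
    low-occupancy (vulnerable-to-interference) apps get higher priority."""
    ranked = sorted(apps.items(), key=lambda kv: kv[1].occupancy)
    bucket = max(1, len(ranked) // NUM_LEVELS)
    for i, (_, st) in enumerate(ranked):
        st.level = min(i // bucket, NUM_LEVELS - 1)
        st.served = 0

def on_request_served(st: AppState) -> None:
    """Lower an application's priority as its requests are served, so no
    application monopolizes a level and starves the others."""
    st.served += 1
    if st.served >= DECAY_THRESHOLD and st.level < NUM_LEVELS - 1:
        st.level += 1
        st.served = 0

def pick_next(pending: dict) -> str:
    """Serve the pending request of the highest-priority (lowest-level) app."""
    return min(pending, key=lambda name: pending[name].level)
```

The decay step is what distinguishes this scheme from a static two-group split: even within a level, heavy consumers gradually yield priority rather than being pinned to the bottom of a total-order ranking.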
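SEM's slowdown definition can likewise be sketched in a few lines. This is a simplified illustration under assumed inputs, not the thesis's actual model: the uninterfered IPC is not directly observable, so it is approximated by subtracting the interference cycles the monitor attributes to the application from its measured execution time.

```python
# Illustrative sketch of the IPC-ratio slowdown definition:
#   slowdown = IPC_alone / IPC_shared.
# The interference-cycle count is assumed to come from a monitor that
# de-duplicates cycles overlapped across banks (bank-level parallelism).

def estimated_alone_ipc(instructions: int, cycles_shared: int,
                        interference_cycles: int) -> float:
    """Estimate uninterfered IPC by removing the cycles attributed to
    write, row-buffer, and data-bus interference from the measured time."""
    alone_cycles = cycles_shared - interference_cycles
    return instructions / alone_cycles

def slowdown(instructions: int, cycles_shared: int,
             interference_cycles: int) -> float:
    """Slowdown = uninterfered IPC / interfered IPC."""
    ipc_shared = instructions / cycles_shared
    ipc_alone = estimated_alone_ipc(instructions, cycles_shared,
                                    interference_cycles)
    return ipc_alone / ipc_shared
```

Because the ratio is computed over IPC rather than request service rate, the same formula applies to memory-intensive and compute-intensive applications alike, which is the unification the thesis argues for.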
Keywords/Search Tags:Memory interference, memory access scheduling, slowdown estimation model, predictable performance, hardware complexity, bandwidth allocation