
Analysis Of Cache Misses In Chip Multiprocessor Using Simics

Posted on: 2011-11-09
Degree: Master
Type: Thesis
Country: China
Candidate: Mohamed Boukhary B H L
Full Text: PDF
GTID: 2178360308469486
Subject: Computer Science
Abstract/Summary:
Chip multiprocessors (CMPs) combine multiple processors on a single die, typically with private level-one caches and a shared level-two cache. However, the increasing number of processor cores on a single chip increases the demand on two critical resources: the shared L2 cache capacity and the off-chip pin bandwidth. Demand on these resources is further exacerbated by latency-hiding techniques such as hardware prefetching. Processor speed has been increasing at a much greater rate than memory speed, leading to the so-called processor-memory gap. To compensate for this gap in performance, modern computers rely heavily on a hierarchical memory organization with a small amount of fast memory called cache. The true cost of a memory access is hidden, provided the data can be obtained from cache. Substantial improvement in the runtime of a program can be obtained by making intelligent algorithmic choices that make better use of the cache. Previous work has largely concentrated on improving memory performance through better cache design and compiler techniques for generating code with better locality. Generally, these improvements have been measured using collections of benchmark programs, simulations and statistical methods. In contrast, this work investigates how the choice of algorithm affects cache performance.
The primary goal of this thesis is to help the user easily determine appropriate cache parameters such as cache size, line size, set associativity, and write or replacement strategy. It also aims to provide programmers with the information they need to perform manual code optimisation. For this purpose, an analysis model is designed and implemented. The analysis is based on the cache event trace acquired from the SIMICS simulation environment. The model classifies cache misses and delivers statistics on them. This information can be applied directly to optimise cache parameters on multiprocessor systems.
In addition, an interface to an existing visualisation tool is built, enabling a graphical representation of the analysis results. Information in this form allows the user to identify the optimisation target.
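The abstract does not say which classification scheme the analysis model uses. As an illustration only, the following minimal sketch assumes the classic three-way split into compulsory, capacity and conflict misses for a single set-associative cache with LRU replacement. The function name `classify_misses`, its parameters, and the trace format (a plain list of byte addresses) are hypothetical and are not taken from the thesis or from the SIMICS trace format; coherence misses, which also occur on multiprocessors, are ignored here.

```python
from collections import Counter, OrderedDict


def classify_misses(addresses, cache_size=256, line_size=32, ways=2):
    """Count hits and compulsory/capacity/conflict misses for a trace.

    Assumed model (not from the thesis): one set-associative LRU cache.
    A miss is 'compulsory' on the first reference to a block, 'conflict'
    if a fully associative LRU cache of the same capacity would have hit,
    and 'capacity' otherwise.
    """
    num_lines = cache_size // line_size
    num_sets = num_lines // ways
    sets = [OrderedDict() for _ in range(num_sets)]  # per-set LRU order
    fully_assoc = OrderedDict()  # reference cache: same capacity, one set
    seen = set()                 # blocks referenced at least once
    stats = Counter()

    for addr in addresses:
        block = addr // line_size

        # Update the fully associative reference cache first.
        fa_hit = block in fully_assoc
        if fa_hit:
            fully_assoc.move_to_end(block)
        else:
            fully_assoc[block] = True
            if len(fully_assoc) > num_lines:
                fully_assoc.popitem(last=False)  # evict least recently used

        # Then look the block up in the modelled set-associative cache.
        cache_set = sets[block % num_sets]
        if block in cache_set:
            cache_set.move_to_end(block)
            stats["hit"] += 1
        else:
            cache_set[block] = True
            if len(cache_set) > ways:
                cache_set.popitem(last=False)
            if block not in seen:
                stats["compulsory"] += 1
            elif fa_hit:
                stats["conflict"] += 1
            else:
                stats["capacity"] += 1
        seen.add(block)

    return stats
```

Running the same trace with different values of `cache_size`, `line_size` and `ways` shows how the miss mix shifts: a high conflict count suggests trying a higher associativity, while a high capacity count suggests a larger cache.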
Keywords/Search Tags:Multiprocessor