myCACTI: A new cache design tool for pipelined nanometer caches

Posted on: 2007-05-31
Degree: Ph.D
Type: Dissertation
University: University of Maryland, College Park
Candidate: Rodriguez, Samuel Verzola
Full Text: PDF
GTID: 1458390005990255
Subject: Engineering
Abstract/Summary:
Caches have long been one of the most important techniques for bridging the memory wall, the speed gap between the microprocessor and main memory. Their importance continues to grow as we enter the regime of nanometer process technologies (i.e., 90nm and below), with industry investing a larger and larger fraction of a chip's transistor budget in improving the on-chip cache. In practice, this has proven to be an efficient way to use the increasing number of transistors available with each succeeding technology. Consequently, it becomes even more important to have cache design tools that accurately represent the designs found in actual microprocessors.

The cache design tools most widely used in academia are CACTI [Wilton1996] and eCACTI [Mamidipaka2004], which have proven very useful not just for cache designers but also for computer architects. This dissertation shows that both CACTI and eCACTI still contain major limitations, and even flaws, in their design, making them unsuitable for very-deep-submicron and nanometer caches, especially pipelined designs. These limitations and flaws are discussed in detail.

This dissertation then introduces a new tool, called myCACTI, that addresses all these limitations and, in addition, introduces major enhancements to the simulation framework. Some of the major enhancements are briefly described as follows: (1) Use of SPICE BSIM4.0 equations to accurately characterize device behavior for nanometer process technologies. In contrast, CACTI and, to a major extent, eCACTI simply use hardcoded parameters derived for an obsolete 0.80 µm process technology. (2) The modeling of a typical explicitly pipelined cache, which accounts for all the pipelining overhead that will be present in virtually all industry-level microprocessor caches. In contrast, CACTI and eCACTI model a wave-pipelined cache, something that is not representative of commercial designs. (3) Inclusion of more optimal, variable-stage dynamic logic circuits for the decode hierarchy, which give the tool more flexibility in finding optimal implementations. In contrast, both CACTI and eCACTI model a fixed-stage static CMOS decode hierarchy, significantly limiting the optimization search. (4) Inclusion of an accurate model and per-process numbers for a typical BEOL stack that are representative of nanometer processes. This is all the more significant given the tremendous effect of interconnect parasitics on a cache's behavior. (5) Inclusion of a gate leakage tunneling current model for improved handling of static power dissipation. (6) Inclusion of a realistic interconnect model that is representative of the interconnects in a real nanometer cache. In contrast, both CACTI and eCACTI model the interconnect unrealistically, assuming interconnect with a single characteristic no matter where it is located and used in the cache.

This dissertation then demonstrates the use of myCACTI in the cache design process. Detailed design space explorations are performed on multiple cache configurations to produce Pareto-optimal curves that show the optimal implementations. Detailed studies are also performed to characterize the delay and power dissipation of different cache configurations and implementations. Some of the more important observations, among the many that were found, are as follows: (1) The pipeline power dissipation overhead is very significant, and it typically dominates the total power. (2) Interesting non-monotonic behavior with respect to delay and power dissipation exists for caches with different associativities, such that we can conclude that some optimal implementations are definitely superior to other optimal implementations. In other words, overlapping Pareto-optimal curves result in some optimal points being reconsidered as optimal. (3) The power dissipation due to gate leakage tunneling current is, surprisingly, not as significant as initially expected.

Finally, future directions for the development of myCACTI are identified, showing possible ways the tool can be improved to allow an even wider range of studies to be performed.
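The idea of overlapping Pareto-optimal curves can be illustrated with a minimal sketch. It assumes each cache implementation is summarized as a (delay, power) pair with both values to be minimized. The function names, configuration names, and numbers below are hypothetical and are not taken from myCACTI or from the dissertation's results. The sketch shows one reading of observation (2): a point that is optimal within its own associativity can be dominated once the curves for several associativities are overlaid.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (delay, power); both are minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing delay."""
    front: List[Point] = []
    best_power = float("inf")
    # After sorting by (delay, power), a point is non-dominated only if
    # its power is strictly lower than that of every point before it.
    for delay, power in sorted(set(points)):
        if power < best_power:
            front.append((delay, power))
            best_power = power
    return front


def dominated_after_overlay(
    per_config: Dict[str, List[Point]]
) -> Dict[str, List[Point]]:
    """For each configuration, list the points that are optimal within
    that configuration but dominated once all curves are overlaid."""
    fronts = {name: pareto_front(pts) for name, pts in per_config.items()}
    overall = set(pareto_front([p for f in fronts.values() for p in f]))
    return {
        name: [p for p in front if p not in overall]
        for name, front in fronts.items()
    }


# Hypothetical data in arbitrary units, for illustration only.
example = {
    "direct_mapped": [(1.0, 5.0), (1.5, 3.0), (2.0, 2.5)],
    "two_way": [(1.2, 2.8), (1.8, 2.0), (2.5, 1.9)],
}
dropped = dominated_after_overlay(example)
```

With this made-up data, every point lies on its own configuration's curve, yet two of the direct-mapped points are dominated by two-way points once the curves are overlaid.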
Keywords/Search Tags: Cache, CACTI, Tool, Nanometer, myCACTI, Power dissipation, Optimal implementations, Process