Multiple clock domain microarchitecture design and analysis

Posted on: 2004-08-16
Degree: Ph.D.
Type: Dissertation
University: The University of Rochester
Candidate: Semeraro, Greg Philip
Full Text: PDF
GTID: 1458390011457800
Subject: Engineering
Abstract/Summary:
As clock frequency increases and feature size decreases, clock distribution and skew tolerance present growing challenges to the designers of singly-clocked, globally synchronous processors. We describe a globally-asynchronous, locally-synchronous (GALS) approach, which we call a Multiple Clock Domain (MCD) processor, in which the chip is divided into several clock domains, within which independent voltage and frequency scaling can be performed. Boundaries between domains are chosen to exploit existing queues, thereby minimizing inter-domain synchronization costs. We propose four clock domains corresponding to the front end (including L1 instruction cache), integer units, floating-point units, and load-store units (including L1 data cache and unified L2 cache).

In addition, we quantify the potential energy savings of a specific MCD processor based on the Alpha 21264 microprocessor, using off-line analysis of traces from a broad range of applications. With the results from this off-line algorithm as a benchmark, we describe the design, analysis and performance of a realistic on-line frequency/voltage control algorithm which achieves on average a 19.0% reduction in Energy Per Instruction (EPI), a 3.2% increase in Cycles Per Instruction (CPI), and a 16.7% improvement in the Energy-Delay product, with a Power Savings to Performance Degradation ratio of 4.6. This Energy-Delay product improvement is 85.5% of what was achieved using the off-line algorithm. All of our results (from both the off-line and on-line algorithms) were achieved using a broad mix of compute-bound, memory-bound, and rate-based applications from the MediaBench, Olden, and SPEC2000 benchmark suites.

We also demonstrate that the inherent characteristics of an MCD microarchitecture allow internal processor complexity to be dynamically traded for frequency on a per-domain basis. Simply configuring the MCD processor once per application increases performance by 17.6%, on average, compared to the best fully synchronous design. When adapting to application phases, performance improves by 20.4%.

These techniques provide an enabling technology which will allow future processor designs to achieve higher levels of scalability, performance, and energy efficiency than would otherwise be possible with a monolithic synchronous processor.
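
The EPI, CPI, and Energy-Delay figures quoted for the on-line algorithm are related, and a rough consistency check can be sketched in a few lines of Python. The function name `energy_delay_change` is hypothetical, and the sketch assumes that the instruction count is fixed, that delay scales with CPI against a fixed reference clock, and that the Energy-Delay product is the product of the relative EPI and relative CPI. None of these definitions is stated in the abstract.

```python
def energy_delay_change(epi_change: float, cpi_change: float) -> float:
    """Fractional change in the energy-delay product.

    Assumption (not from the abstract): energy-delay is proportional to
    EPI * CPI for a fixed instruction count and a fixed reference clock.
    Inputs are fractional changes, e.g. -0.19 for a 19% reduction.
    """
    return (1.0 + epi_change) * (1.0 + cpi_change) - 1.0


# Averages quoted in the abstract: EPI down 19.0%, CPI up 3.2%.
change = energy_delay_change(-0.190, 0.032)
print(f"{change:.1%}")  # prints -16.4%
```

Under these assumptions the aggregate figures give an improvement of about 16.4%, close to but not equal to the reported 16.7%. One plausible reason is that the reported values are averages of per-application results, which need not match a value recomputed from averaged inputs.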
Keywords/Search Tags:Clock, Processor, Performance, Energy, MCD