
Modeling and analysis of x86-based front-end architectures

Posted on: 2003-10-07
Degree: Ph.D.
Type: Dissertation
University: University of Michigan
Candidate: Vlaovic, Steven Alexander
Full Text: PDF
GTID: 1462390011984841
Subject: Engineering
Abstract/Summary:
This dissertation analyzes x86 processor models in order to better understand the impact that the x86 instruction set architecture (ISA) has on the front end of high performance x86 processors. In order to design better processors, it is important to understand existing processors. Real-world parameters often dictate that a specific ISA must be supported, due to the amount of legacy software available for that ISA. This is the hallmark of modern platforms that support the x86 ISA. For more than 20 years, this architecture has been in existence and architects have had to design for both improved performance and backward compatibility.

To aid in our evaluation of x86 processors, we have developed an x86 simulation infrastructure, called Trace Analysis for X86 Interpretation, or TAXI, which provides a platform for modeling x86 processors. Not only does TAXI provide detailed performance results, it also has special facilities to do basic branch prediction, instruction cache, and Branch Target Buffer (BTB) studies without running the entire cycle accurate processor model. For our BTB studies, we developed a specialized branch filtering mechanism, DLL filtering, that reduced contention in the BTB by removing Dynamically Linked Library calls from the BTB. Using the TAXI performance model, we have been able to evaluate and improve the performance of two processor models, Model III and Model 4, patterned after the Pentium III and the Pentium 4, respectively. In our analysis of Model III, we decompose front end performance into seven orthogonal components. By improving the branch predictor and the instruction cache, we obtained 20% higher performance across our desktop application suite.

We use TAXI to compare Model III with Model 4 across our application suite. Model 4 has a pipeline that is almost three times as long, yet requires only 30% more cycles per instruction (CPI) than Model III. By developing a new classification for Model 4 trace cache accesses, we discovered that nonhead misses caused the greatest performance loss. We have proposed nonhead miss speculation, a hardware technique, which achieves about 10% speedup over our eight desktop applications by removing most of the penalty associated with nonhead misses.
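The idea behind DLL filtering, keeping DLL calls out of the BTB so they cannot evict other branches, can be illustrated with a toy model. The sketch below is not TAXI's implementation: the direct-mapped organization, the `Branch` record, the `is_dll_call` trace flag, and the function names are all assumptions made for illustration, since the abstract does not say how DLL calls are identified or how the BTB is organized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    pc: int            # address of the branch instruction
    target: int        # address control was transferred to
    is_dll_call: bool  # hypothetical trace flag (detection method is assumed)


class DirectMappedBTB:
    """Toy direct-mapped BTB; each entry holds a (pc, target) pair or None."""

    def __init__(self, num_entries, filter_dll=False):
        self.num_entries = num_entries
        self.filter_dll = filter_dll
        self.entries = [None] * num_entries

    def access(self, br):
        """Return True if the BTB already held the right target, then update."""
        index = br.pc % self.num_entries
        hit = self.entries[index] == (br.pc, br.target)
        # With filtering on, DLL calls are never allocated an entry,
        # so they cannot evict the branch that shares their index.
        if not (self.filter_dll and br.is_dll_call):
            self.entries[index] = (br.pc, br.target)
        return hit


def hits_on_regular_branches(trace, btb):
    """Count correct BTB predictions for branches that are not DLL calls."""
    hits = 0
    for br in trace:
        if btb.access(br) and not br.is_dll_call:
            hits += 1
    return hits


# A loop branch and a DLL call that map to the same entry of an 8-entry BTB.
loop_branch = Branch(pc=0x10, target=0x00, is_dll_call=False)
dll_call = Branch(pc=0x18, target=0x7000, is_dll_call=True)
trace = [loop_branch, dll_call] * 4

unfiltered_hits = hits_on_regular_branches(trace, DirectMappedBTB(8))
filtered_hits = hits_on_regular_branches(trace, DirectMappedBTB(8, filter_dll=True))
print(unfiltered_hits, filtered_hits)  # prints "0 3"
```

In this contrived trace the two branches alias to the same entry, so without filtering the loop branch is evicted on every iteration; with filtering it misses only on its first access. Real traces would show a far smaller effect that depends on BTB size, associativity, and the workload.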
Keywords/Search Tags:Model, X86, ISA, BTB, TAXI, Instruction, Performance