The Optimization Of Memory Controller For High Performance CPU

Posted on: 2013-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Wang
Full Text: PDF
GTID: 2268330392473887
Subject: Software engineering
Abstract/Summary:
Memory access speed is an important factor in processor performance, especially for multi-core, multi-threaded processors. The performance of main memory depends on the memory controller, which determines many key properties of the memory, such as the maximum capacity, the number of banks, the type, the speed, and the depth and width of the data. This work focuses on optimizing the memory controller of the X processor. The X processor is a high-performance processor with 16 cores and four strands per core; it supports multithreading and SIMD. Each core has two integer execution pipelines, one vector processing unit, one floating-point execution pipeline, and one load/store pipeline. Its enormous data demand puts heavy pressure on memory. To relieve it, four on-chip dual-channel DRAM controllers are built in. This eases the memory pressure to some extent, but it also scatters memory addresses across the controllers.

This thesis studies the architecture of the X processor and DDR3 SDRAM in depth. After analyzing the structure of the existing memory controller, I make the following improvements. To improve program locality, bank parallelism, and row locality, a new address translation scheme is designed. To raise the probability of hitting the same row and to reduce the switching delay between reads and writes, a two-layer memory scheduler is designed; it schedules the request sequence inside each bank and across banks separately, and includes a starvation-avoidance mechanism. This greatly improves memory bandwidth utilization. By increasing the number of active pages, I reduce the delay caused by frequently opening and closing the active page. To achieve this, a virtual buffer module is placed between the on-chip buffers and the memory controller.

The memory controller of the X processor is implemented in Verilog and its functionality is thoroughly verified to ensure the correctness of the design. Finally, this thesis analyzes in detail the performance of the architecture before and after optimization. The bandwidth increases from 5.88 GB/s to 18.55 GB/s, roughly a 3.2-fold improvement. The results show the superiority of the optimized structure.
Keywords/Search Tags: Memory Controller, Address Translation, Request Scheduler, Row Buffer
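
The abstract does not give the address layout or the scheduling rules, so the following Python sketch only illustrates the general ideas it names: an address mapping that favors row locality and bank parallelism, and a row-hit-first request choice with a starvation guard. Every field width, the XOR bank hash, the names `map_address` and `pick_next`, and the `STARVATION_LIMIT` value are assumptions made for illustration; this is a simplified single-function stand-in, not the thesis's two-layer scheduler or its Verilog design.

```python
from dataclasses import dataclass

# Assumed geometry. Only "4 controllers" and "dual channel" come from the abstract.
LINE_BITS = 6   # 64-byte cache line offset (assumed)
COL_BITS = 7    # cache lines per DRAM row (assumed)
CTRL_BITS = 2   # 4 on-chip controllers
CHAN_BITS = 1   # 2 channels per controller
BANK_BITS = 3   # 8 banks (assumed)


def map_address(addr):
    """Split a physical address into hypothetical DRAM coordinates."""
    fields = {}
    a = addr >> LINE_BITS
    # Column bits are lowest, so consecutive cache lines fall in the same row.
    for name, bits in (("col", COL_BITS), ("ctrl", CTRL_BITS),
                       ("chan", CHAN_BITS), ("bank", BANK_BITS)):
        fields[name] = a & ((1 << bits) - 1)
        a >>= bits
    fields["row"] = a
    # XOR the bank with low row bits so neighbouring rows land in different banks.
    fields["bank"] ^= a & ((1 << BANK_BITS) - 1)
    return fields


@dataclass
class Request:
    bank: int
    row: int
    is_write: bool
    age: int = 0  # scheduling rounds this request has waited


STARVATION_LIMIT = 8  # assumed threshold


def pick_next(queue, open_rows, last_was_write):
    """Return the index of the request to issue next (queue must be non-empty)."""
    oldest = max(range(len(queue)), key=lambda i: queue[i].age)
    if queue[oldest].age >= STARVATION_LIMIT:
        return oldest  # starvation guard overrides everything else

    def score(i):
        r = queue[i]
        row_hit = open_rows.get(r.bank) == r.row   # avoids precharge + activate
        same_dir = r.is_write == last_was_write    # avoids read/write turnaround
        return (row_hit, same_dir, r.age)

    return max(range(len(queue)), key=score)
```

With these assumed widths, addresses 0 and 64 map to the same bank and row, while addresses 0 and `1 << 19` map to different banks. In `pick_next`, a request that hits the open row is chosen ahead of an older row-miss request until the older one reaches `STARVATION_LIMIT`.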