
Research On Key Technologies Of CC-NUMA Based Memory Architecture

Posted on: 2008-07-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G T Pan
Full Text: PDF
GTID: 1118360242499350
Subject: Computer Science and Technology
Abstract/Summary:
Distributed shared memory (DSM) systems provide a global shared address space, striking a trade-off between shared-memory multiprocessors and distributed-memory systems. With its advantages in programmability and scalability, DSM has become the preferred hardware platform for massively parallel high-performance computer systems. CC-NUMA is an effective mechanism for improving the performance of DSM systems. Maintaining cache coherence, which determines system correctness and also greatly affects system performance, has been the primary difficulty in implementing CC-NUMA systems. Current research focuses on scalable, high-performance implementations of directory-based cache coherence. Processors in CC-NUMA systems communicate with each other through shared memory, so the latency of remote memory accesses, especially with a large number of processors, dramatically affects system performance. The key to an effective CC-NUMA implementation lies in improving memory bandwidth, shortening memory access latency, and narrowing the gap between local and remote memory access latency.

This dissertation is devoted to the implementation of an effective memory architecture for CC-NUMA systems. It studies the scalability of directory-based cache coherence, the optimization of directory protocols, a simulation and verification platform for CC-NUMA systems, and techniques for improving memory bandwidth and reducing access latency. The main work and contributions of the dissertation are as follows:

1. A new scalable CC-NUMA architecture based on SMP nodes, called SCDSM, is proposed. A lock-free, high-performance directory-based cache coherence protocol is implemented on SCDSM. An FWB strategy is proposed to resolve the inconsistency between cache state and directory state that arises when a read request hits a dirty cache block on the bus of an SMP node. The strategy solves the difficult problem of making directory protocols compatible with snooping protocols. An LMRDF strategy is proposed to reduce the delay in sending requests caused by waiting for the hit result on the bus in a CC-NUMA system based on SMP nodes. This technique improves the performance of the SCDSM system by 10%-15%.

2. A Markov chain model is built for the distribution of shared data in CC-NUMA systems, and the distribution pattern of shared data is analyzed with this model. It is proved that the average number of cached copies of shared data is small in CC-NUMA systems. This theoretical analysis of the distribution pattern of shared data can help in proposing more effective directory organization methods.

3. A two-level directory organization scheme based on a directory cache is proposed to address the directory memory overhead that limits the scalability of cache coherence protocols. The scheme reduces the memory overhead of directory information and improves the scalability of the protocol. Simulation and analysis show that the execution times of a number of parallel benchmarks are shortened to varying degrees.

4. The memory wall is the bottleneck of system performance, and reducing memory access latency is the challenge that must be faced. Four memory scheduling algorithms with different degrees of constraint are presented. Simulation and analysis show that the greedy memory scheduling algorithm with bank-address conflict elimination and a starvation-avoidance strategy is effective. A DDR2-based memory controller is implemented in hardware.

5. A distributed multi-node simulation and verification platform named CoSim is proposed to effectively verify the correctness of complex or large systems. To assist simulation testing and verification of the cache coherence protocol, the CMCV model is proposed. A QSCV model, similar to stream copy, is built in the Verilog hardware description language to evaluate the LMRF technique and the memory bandwidth of the SCDSM system.

In summary, the dissertation provides feasible solutions to a number of challenging problems in CC-NUMA systems, and these solutions have been implemented in engineering practice. It is believed that this research lays good groundwork for further research and engineering on CC-NUMA based memory architecture.
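
The abstract does not spell out the greedy memory scheduling algorithm named in contribution 4, so the following is only an illustrative sketch of the general idea, not the dissertation's algorithm. It assumes a single-issue controller, a fixed bank busy time, and a read-over-write preference as the "greedy" criterion. The names `Request`, `schedule`, `bank_busy`, and `max_wait` are hypothetical and chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    rid: int        # request identifier (hypothetical field)
    bank: int       # target DRAM bank index
    is_read: bool   # reads are preferred over writes in this sketch
    arrival: int    # cycle at which the request enters the queue


def schedule(requests, num_banks=4, bank_busy=3, max_wait=8):
    """Return a list of (cycle, rid) issue decisions, one issue per cycle.

    Assumed policy (not taken from the dissertation):
      * bank conflict elimination: only requests whose bank is idle are eligible;
      * starvation avoidance: a request that has waited max_wait cycles or more
        is served before any non-starved request;
      * greedy choice otherwise: reads before writes, then oldest first.
    """
    pending = sorted(requests, key=lambda r: r.arrival)
    bank_free_at = [0] * num_banks   # first cycle at which each bank is idle
    issued = []
    cycle = 0
    while pending:
        ready = [r for r in pending
                 if r.arrival <= cycle and bank_free_at[r.bank] <= cycle]
        if ready:
            def priority(r):
                starved = (cycle - r.arrival) >= max_wait
                # Smaller tuples win: starved first, then reads, then age.
                return (not starved, not r.is_read, r.arrival, r.rid)

            chosen = min(ready, key=priority)
            pending.remove(chosen)
            bank_free_at[chosen.bank] = cycle + bank_busy
            issued.append((cycle, chosen.rid))
        cycle += 1
    return issued


if __name__ == "__main__":
    demo = [
        Request(rid=0, bank=0, is_read=True, arrival=0),
        Request(rid=1, bank=0, is_read=True, arrival=0),
        Request(rid=2, bank=1, is_read=False, arrival=0),
    ]
    # Request 2 is issued while bank 0 is still busy, instead of stalling.
    print(schedule(demo))  # [(0, 0), (1, 2), (3, 1)]
```

In this sketch, lowering `max_wait` bounds how long a write can be bypassed by a steady stream of reads, at the cost of occasionally delaying a read.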
Keywords/Search Tags: CC-NUMA, memory access latency, cache coherence protocol, directory organization, memory overhead, scalability, memory scheduling algorithm, simulation and verification