
Research And Implementation Of The Cache Coherence Protocol For The Large Scale System Of The SMP-based CC-NUMA Category

Posted on: 2008-06-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z B Pang
Full Text: PDF
GTID: 1118360242499357
Subject: Computer Science and Technology
Abstract/Summary:
As the demands on high performance computing grow, designing and implementing high performance computers is becoming more challenging. Programmability, usability, and system performance have become the main design objectives for such systems. Distributed shared memory multiprocessor systems, which are easy to program and scale well, have become the main platform for high performance computing. As a popular scalable approach, CC-NUMA (Cache Coherent Non-Uniform Memory Access) is becoming an important architecture for high-productivity high performance computing. Many factors affect CC-NUMA system performance, and among them the cache coherence protocol is the key to system scalability. Most existing CC-NUMA computers are limited in scale because cache coherence is complex to implement. In practice, clusters of CC-NUMA machines are usually used as high performance computers, which results in poor programmability. It is therefore important and necessary to design and develop a scalable, efficient cache coherence protocol for large-scale CC-NUMA systems.

This dissertation studies the efficient implementation of the cache coherence protocol on the Scalable Cache Coherence Multi-Processors (SCCMP), a large-scale SMP-based CC-NUMA system. The work covers designing an efficient, scalable cache coherence protocol suited to the architecture, designing and implementing a scalable directory scheme, implementing directory access efficiently, supporting cache-coherent message passing communication, and validating the protocol. The primary innovations are summarized as follows:

(i) After analyzing the hierarchy and structural features of the SCCMP system, we designed and implemented the efficient HYbrid Scalable Cache Coherence (HYSCC) protocol. By combining the advantages of the snooping bus protocol and the directory approach, HYSCC meets the needs of the different levels of the SCCMP hierarchy and simplifies its own design and implementation. Its scalability rests on our scalable directories, and its efficiency comes from multiple virtual channels, concurrent non-blocking processing, and compact message types. HYSCC also provides special messages and handling for the case in which dirty data becomes shared among the processors within an SMP node, which reduces the complexity of writing back dirty data and simplifies protocol design within an SMP node.

(ii) We discussed the impact of distributed shared I/O accesses on cache coherence and provided special messages and coherence handling procedures to support cache-coherent accesses with I/O attributes. We also proposed an effective method for processing I/O accesses concurrently and implemented a coherence maintenance mechanism for data with I/O attributes in the SCCMP system.

(iii) We studied feasible and scalable directory schemes and proposed the Dir5NB+CCV directory scheme for the SCCMP system. Dir5NB+CCV combines a modified limited pointer directory (Dir5NB) scheme with the combined coarse vector (CCV) directory scheme, retaining the advantages of both the pointer representation and the full-map vector representation. This hybrid representation effectively decreases directory memory overhead, and by drawing on the strengths of both schemes it reduces the inaccuracy of sharing information, reduces unnecessary invalidations, and suits an efficient hardware implementation.

(iv) We proposed a dual storage module structure and a dual directory cache (DC) structure to relieve access conflicts and improve directory performance. The SCCMP system has no dedicated directory storage; instead, the dual storage module structure allows data and the corresponding directory entry to be accessed concurrently. To relieve the memory access bottleneck, we designed and implemented the dual directory cache structure, which corresponds to the dual storage module structure and introduces a cache hierarchy. It optimizes directory access by exploiting program locality and relieves memory access pressure. Experiments show that the dual storage module and directory cache structures greatly improve system performance.

(v) We studied how to integrate the message passing communication paradigm effectively into shared memory in the SCCMP system. We proposed a hierarchical coherent communication model, provided a communication interface in the SCCMP node controller, and implemented a deadlock-free communication protocol and a coherent block data transfer mechanism to support multi-domain MPI communication.

(vi) We designed the SCCMP node controller and implemented an FPGA prototype for validation. The HYSCC protocol was validated on a 4-node FPGA prototype, and an ASIC chip of the SCCMP node controller was fabricated. Experiments were carried out on a 64-node ASIC system. All tested applications, including the NAS NPB benchmarks, produced correct results on the system. Memory-intensive applications such as EP, SP, FT, and MG scaled well. Communication tests showed a maximum communication bandwidth of more than 1.3 GB/s, and the maximum communication bandwidth achieved by applications can exceed 1.1 GB/s.

(vii) Our research results are applicable to large-scale systems of the SMP-based CC-NUMA category and have already been used successfully in several important projects.
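The general idea behind a hybrid of limited pointers and a coarse vector, as in the Dir5NB+CCV scheme described above, can be illustrated with a small model. This is only a sketch under assumptions and not the dissertation's actual entry encoding: the abstract gives only the pointer count (5), so the group size `GROUP_SIZE`, the overflow policy, and the class and method names below are hypothetical, and `NUM_NODES = 64` is simply borrowed from the 64-node test system.

```python
from dataclasses import dataclass, field

NUM_NODES = 64      # assumption: borrowed from the 64-node system in the abstract
MAX_POINTERS = 5    # the "5" in Dir5NB
GROUP_SIZE = 4      # hypothetical: number of nodes covered by one coarse-vector bit


@dataclass
class DirectoryEntry:
    """Hypothetical hybrid directory entry: exact pointers, then a coarse vector."""
    pointers: list = field(default_factory=list)  # exact sharer IDs (pointer mode)
    coarse_bits: int = 0                          # one bit per node group (coarse mode)
    coarse_mode: bool = False

    def add_sharer(self, node: int) -> None:
        if not 0 <= node < NUM_NODES:
            raise ValueError(f"node {node} out of range")
        if self.coarse_mode:
            # Already imprecise: just mark the group that contains the node.
            self.coarse_bits |= 1 << (node // GROUP_SIZE)
            return
        if node in self.pointers:
            return
        if len(self.pointers) < MAX_POINTERS:
            self.pointers.append(node)
            return
        # Pointer overflow: fold all known sharers plus the new one into group bits.
        self.coarse_mode = True
        for n in self.pointers + [node]:
            self.coarse_bits |= 1 << (n // GROUP_SIZE)
        self.pointers = []

    def invalidation_targets(self) -> list:
        """Nodes that must receive an invalidation before a write proceeds."""
        if not self.coarse_mode:
            return sorted(self.pointers)
        targets = []
        for group in range(NUM_NODES // GROUP_SIZE):
            if (self.coarse_bits >> group) & 1:
                start = group * GROUP_SIZE
                targets.extend(range(start, start + GROUP_SIZE))
        return targets


entry = DirectoryEntry()
for sharer in (0, 9, 17, 33, 60):
    entry.add_sharer(sharer)
exact_targets = entry.invalidation_targets()   # still precise with 5 sharers

entry.add_sharer(2)                            # sixth sharer forces coarse mode
coarse_targets = entry.invalidation_targets()
```

In this toy model the entry stays exact while there are at most five sharers. Once a sixth is added, invalidations go to every node in each marked group, so some nodes that never held the block are invalidated too. That surplus is the kind of inaccuracy and unnecessary invalidation traffic the abstract says the Dir5NB+CCV scheme is designed to reduce.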
Keywords/Search Tags:SMP, CC-NUMA, cache coherence protocol, directory scheme, directory cache, distributed shared I/O, coherent block data transfer, message passing, shared memory multiprocessor