Private Last Level Cache Sharing Architecture For Many-core System

Posted on: 2014-02-19 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Ye | Full Text: PDF
GTID: 2248330392960958 | Subject: Signal and Information Processing
Abstract/Summary:
The growing complexity of applications and tight power constraints have limited the performance of single-core and multi-core processing systems, so many-core systems with far more processing cores have attracted increasing attention. However, a larger number of cores brings new challenges for the on-chip memory architecture, which is the performance bottleneck of the processing system. First, more cores mean a larger chip and longer on-chip wire delays. Second, the shared-data model that results from finer-grained application parallelism in many-core systems increases the demand on each core's memory space. Facing these challenges, traditional multi-core memory architectures each have drawbacks: a Shared Last Level Cache generates a large amount of on-chip network communication, and its modularity and extensibility are poor relative to a single core; a Private Last Level Cache has a small effective capacity, leading to more off-chip misses; and Cooperative Caching gives the data requester only a few choices, which can lead to long-distance, cross-chip data accesses.

To address the challenges of many-core processing systems and the drawbacks of traditional architectures, a Private Last Level Cache Sharing Architecture for Many-core System is proposed. The proposed architecture is based on the Private Last Level Cache and shares memory capacity among the private caches of different cores by reserving victim lines in other cores and enabling inter-core data access. Reserving multiple copies of a victim line gives the data requester more choices and makes it possible to fetch data from a more appropriate location. Meanwhile, to reduce the impact on other cores' memory, fine-grained reservation control is applied in two dimensions: the number of reserved copies, via the Reserving Number Decision Algorithm Based on Online Dynamic Threshold Adjustment, and the place of reservation, via the Reserving Place Decision Algorithm Based on Online Monitoring for Memory Usage.

After the detailed implementation of the proposed architecture is introduced, its hardware cost is analyzed: the hardware overhead is around 4.35% to 8.20%. The performance of the proposed architecture is then compared with traditional architectures on a 64-core system using the gem5 full-system simulation platform. The performance analysis shows the following. For on-chip network communication, the proposed architecture reduces traffic by 78.6% compared to Shared Last Level Cache, increases it slightly compared to Private Last Level Cache, and reduces it by 11.9% compared to Cooperative Caching. For off-chip memory access, the proposed architecture reduces accesses by 25.6% on average compared to Private Last Level Cache and by 6.5% compared to Cooperative Caching. For the performance of the whole many-core processing system, the proposed architecture improves on Shared Last Level Cache by 59.5%, on Private Last Level Cache by 11.9% at best and 6.2% on average, and on Cooperative Caching by 11.2% at best and 5.3% on average. In conclusion, the hardware cost analysis and the performance analysis show that the proposed architecture can efficiently improve the performance of the on-chip memory architecture and of the processing system; the results also show the effectiveness of the proposed fine-grained reservation control over both the number of reserved copies and the place of reservation.
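The two reservation decisions described above (how many copies of a victim line to reserve, and in which cores) can be illustrated with a small sketch. This is not the thesis's algorithm, whose details are not given in the abstract; it is a toy model under assumptions. The reuse counter, the hit-rate bounds `low` and `high`, the `usage_limit` cutoff, the mesh coordinates, and all function and field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CoreState:
    core_id: int
    x: int          # assumed mesh column of the core
    y: int          # assumed mesh row of the core
    usage: float    # assumed monitor output: fraction of the private LLC in active use


def copies_to_reserve(reuse_count: int, threshold: int, max_copies: int = 2) -> int:
    """Hypothetical rule: lines reused fewer than `threshold` times are not
    reserved; more heavily reused lines get more copies, up to `max_copies`."""
    if reuse_count < threshold:
        return 0
    return min(max_copies, 1 + (reuse_count - threshold) // threshold)


def adjust_threshold(threshold: int, remote_hits: int, reservations: int,
                     low: float = 0.25, high: float = 0.75) -> int:
    """Hypothetical online adjustment: raise the threshold when few reserved
    lines are hit again, lower it (never below 1) when most of them are."""
    if reservations == 0:
        return threshold
    hit_rate = remote_hits / reservations
    if hit_rate < low:
        return threshold + 1
    if hit_rate > high:
        return max(1, threshold - 1)
    return threshold


def pick_hosts(owner: CoreState, cores: list[CoreState], n: int,
               usage_limit: float = 0.8) -> list[int]:
    """Hypothetical placement: skip heavily used caches, then prefer the
    cores closest to the owner by Manhattan distance."""
    candidates = [c for c in cores
                  if c.core_id != owner.core_id and c.usage < usage_limit]
    candidates.sort(key=lambda c: (abs(c.x - owner.x) + abs(c.y - owner.y),
                                   c.usage, c.core_id))
    return [c.core_id for c in candidates[:n]]
```

In this sketch, a core evicting a line would call `copies_to_reserve` and then `pick_hosts` with the result, while `adjust_threshold` would run periodically on counters collected at run time. Excluding heavily used caches in `pick_hosts` mirrors the abstract's stated goal of limiting the impact on other cores' memory.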
Keywords/Search Tags: Many-core Processing System, On-chip Memory Architecture, Last Level Cache Management