Research On Technologies Of Cache Optimization Based On Private LLC For Chip Multiprocessors

Posted on:2018-10-28

Degree:Doctor

Type:Dissertation

Country:China

Candidate:F K Yuan

Full Text:PDF

GTID:1368330566498850

Subject:Computer system architecture

Abstract/Summary:

Driven by Moore's Law, Chip Multiprocessors (CMPs) have shown clear advantages over conventional single-core processors in on-chip resource utilization, power consumption, and design complexity, and have become the preferred way to break through the performance bottleneck. As an intermediate module between the processor cores and memory, the cache hierarchy provides fast access and avoids off-chip memory accesses, which relieves the "Memory Wall" problem. However, the growth in core count severely challenges on-chip cache system design: cache resource utilization must increase further, cache access latencies must fall, and on-chip network traffic must be reduced. Multi-core cache optimization has therefore become the key to CMP performance enhancement and a hot topic in academia. The Last Level Cache (LLC) sits at the end of the cache hierarchy, has a large capacity, and suffers long miss latencies, so the LLC organization is the main consideration in cache optimization research.

This dissertation presents a study of multi-core cache optimization based on a private LLC organization. We focus on two main multi-core cache optimization issues, cache coherence and data placement, exploit the independence of private LLC resource management, and address the problems that CMPs face as core counts scale. The main work and contributions are as follows.

A multi-grain cache coherence filtering protocol, DP&TB, is proposed. To address the cache miss latency and network traffic problems of the Directory and Token protocols, DP&TB combines their advantages. It employs Directory to maintain coherence at page granularity, which exempts the blocks inside private pages from coherence inspection, and uses Token to maintain coherence at block granularity, which filters out broadcasts to processor nodes that do not share the page when a probably shared block inside a shared page is accessed. Experimental results show that DP&TB provides high performance and reduces on-chip network traffic. In addition, the storage overhead of DP&TB is less than half that of Directory, and DP&TB scales well.

A set-granular regional distributed cooperative caching (SRDCC) mechanism is proposed. To address the surge in network traffic seen in Cooperative Caching (CC) studies, SRDCC improves the adaptability of CC to growing core counts. SRDCC adopts a new method for measuring set access pressure that is suited to an exclusive LLC, manages distributed, regional, set-grained receiver information, and can complete receiver tracking rapidly. Experimental results show that SRDCC effectively reduces the network traffic of CC and improves scalability.

A replication-aware cache management (RACMan) mechanism for CMPs with private LLCs is proposed. To address the replication problem of private LLCs, RACMan improves on the static replication strategies of earlier CC studies. RACMan relates the Block Access Pattern (BAP) features of the processor cores to the reusability of replicas in the LLC, and dynamically adjusts the LLC insertion policy for replicas, giving them different initial positions in the LRU chain and therefore different chances of survival in the LLC. Experimental results show that RACMan provides good scalability for private-LLC CMPs in performance, network traffic, and storage overhead.

A reusability and anti-interference predictable cooperative caching (RAPCC) mechanism is proposed. To increase the cache sharing ability of private LLCs, RAPCC addresses the complex problems of private LLC resource sharing. RAPCC relies on interference injection experiments that relate the Reuse Position Distribution (RPD) to the predicted reusability and interference resistance of an LLC, and employs an online RPD category identification algorithm to recognize periodic RPDs and determine the spilling roles of private LLCs dynamically. RAPCC introduces a novel spilling role that spills and receives at the same time, which helps private LLCs allocate on-chip storage resources more rationally. It also amplifies the receptivity of receivers with specific RPDs, extending the survival time of spilled lines for on-chip retrieval by bypassing the receivers. Experimental results show that RAPCC increases cache capacity sharing efficiency and improves system performance.

This dissertation improves the adaptability of CMPs to growing core counts in both cache coherence and data placement, provides more scalable design options for chip multiprocessor architectures, and enhances the practicality of CMPs with private LLCs.
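
The page-granular filtering idea behind DP&TB can be illustrated with a small sketch. This is not the dissertation's implementation: the class name `PageDirectory`, the method `broadcast_targets`, and the 4 KiB page size are all assumptions made for illustration, and the real protocol's token counting and state transitions are omitted. The sketch only shows how tracking sharers per page lets a block-level request skip coherence work for private pages and limit broadcasts to page sharers for shared pages.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the abstract)


@dataclass
class PageDirectory:
    """Hypothetical page-granular directory: page number -> set of sharer cores."""
    sharers: dict = field(default_factory=dict)

    def broadcast_targets(self, core: int, addr: int) -> set:
        """Record the access and return the cores a block-level request must reach."""
        page = addr // PAGE_SIZE
        page_sharers = self.sharers.setdefault(page, set())
        page_sharers.add(core)
        # Private page: only the requester has touched it, so the result is empty
        # and no coherence inspection is needed.
        # Shared page: the request goes only to the other page sharers,
        # not to every core on the chip.
        return page_sharers - {core}


directory = PageDirectory()
print(directory.broadcast_targets(0, 0x1000))  # core 0 touches a new page
print(directory.broadcast_targets(1, 0x1040))  # core 1 touches the same page
```

In this sketch a core that has never touched a page is never a broadcast target for blocks in that page, which is the filtering effect the abstract attributes to combining page-level and block-level coherence.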

Keywords/Search Tags:

Chip Multiprocessors (CMPs), Cache Optimization, Private LLC, Cache Coherence, Data Placement

Related items

1	Cache Coherence Techniques For Chip Multiprocessor Architecture
2	Research On Memory Simulation And Optimizations In CMPs
3	Research On Hybrid Cache Architecture Generation And Access Mechanism For On-Chip Multiprocessors
4	Research On NVM-based Hybrid Cache Architecture For 3D Chip Multiprocessors
5	Assessment of cache coherence protocols in shared-memory multiprocessors
6	Exploiting multi-threaded application characteristics to optimize performance and power of chip-multiprocessors
7	Research And Design Of Cache Coherence For Multi-core Processors
8	High Performance Network-on-Chip For Cache Coherence Optimization
9	Research On Cache Coherence Protocols Based On Data Sharing Characteristics
10	Co-optimization Of On-chip Interconnects And Cache Coherence For Multi/Many-core Systems Based On Multithread Application Characteristics