
Parity-Oriented NoC Shared Cache Partitioning

Posted on: 2016-05-20    Degree: Master    Type: Thesis
Country: China    Candidate: F Cheng    Full Text: PDF
GTID: 2348330536967743    Subject: Software engineering
Abstract/Summary:
Constrained by the physical limits of the transistor, it has become increasingly difficult to improve processor performance merely by raising the clock frequency, deepening the pipeline, or widening the instruction issue. Consequently, Chip Multi-Processors (CMPs) have become a hot research topic in both academia and industry. According to the International Technology Roadmap for Semiconductors (ITRS), the number of cores on a single processor in electronic devices will grow to the thousands in the coming decades [93]. With this explosion in core counts, the traditional bus interconnect scales very poorly in bandwidth, frequency, and power consumption. The large volume of concurrent computation and storage on chip demands a concurrent, pipelined communication mode rather than a sequential one. As a result, the Network-on-Chip (NoC), with its excellent scalability, has become the preferred way to connect the resources on a single chip.

Statistics also show that the cache already occupies about 50% of the area of a single chip, and this trend is continuing. As chips scale up, the centralized cache organization has become a performance and power bottleneck of the System on Chip (SoC) because of its long access delay, severe access contention, and poor scalability. In contrast, the distributed shared cache architecture, with its good scalability, reduced contention, and balanced delay, complements the NoC architecture very well. In short, the SoC has gradually evolved from single-core or small-scale multi-core architectures based on a bus to many-core architectures built on a NoC and a distributed shared cache. Under these conditions, it is natural that Non-Uniform Cache Access (NUCA) has become the mainstream architecture in today's multi-core processor design.

However, with the rapid growth in the scale of multi-core chips, the disparity in cache access latency, and hence in Quality of Service (QoS), among the different nodes on a chip is becoming increasingly serious. To address this problem, this thesis focuses on the QoS problem of the NUCA architecture and makes the following contributions:

1) Based on the cache access characteristics of today's NUCA architectures, we model the access latency of the different nodes on the chip. Specifically, this thesis presents a cache access delay model that accurately estimates the average access delay of each node.

2) This thesis studies distributed shared cache partitioning and proposes a novel method to mitigate the growing cache access disparity in the NUCA architecture.

3) We implement the optimized cache partitioning architecture on several mainstream simulators and perform a full-system evaluation of the target architecture, running the popular benchmark suites SPEC CPU2006 and PARSEC on our design. By analyzing the experimental results thoroughly, we verify and evaluate the system performance of the cache partitioning optimization architecture.
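The keywords name linear programming as the partitioning tool, though the abstract gives no formulation. One plausible sketch (illustrative only; all symbols below are assumptions, not the thesis's actual model) poses parity-aware partitioning as minimizing the worst per-node average delay:

```latex
\begin{align}
\min_{x,\,t} \quad & t \\
\text{s.t.} \quad
  & \sum_{j} d_{ij}\, x_{ij} \le t\, r_i && \forall i
    && \text{(node $i$'s average delay bounded by $t$)} \\
  & \sum_{i} x_{ij} \le C_j && \forall j
    && \text{(capacity of bank $j$)} \\
  & \sum_{j} x_{ij} = r_i && \forall i
    && \text{(node $i$'s capacity demand met)} \\
  & x_{ij} \ge 0
\end{align}
```

Here $x_{ij}$ is the amount of node $i$'s working set placed in bank $j$, $d_{ij}$ the delay from node $i$ to bank $j$, $C_j$ the capacity of bank $j$, and $r_i$ node $i$'s demand. Minimizing $t$ drives every node's average access delay toward a common bound, which matches the access-parity goal the abstract describes.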
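To make the access-parity problem concrete, the following is a minimal sketch (not the thesis's actual model) of why average NUCA access delay differs per node: on a k x k mesh NoC with cache blocks interleaved uniformly across all LLC banks, a node's average delay grows with its total hop distance to the banks. The parameters `hop_cycles` and `bank_cycles` are illustrative assumptions.

```python
# Hedged sketch: per-node average LLC access latency on a k x k mesh NoC.
# Assumptions (not from the thesis): blocks are interleaved uniformly over
# all banks, each router hop costs a fixed number of cycles, and every bank
# has the same access latency.

def avg_access_latency(k, hop_cycles=2, bank_cycles=10):
    """Return a list of per-node average LLC access latencies (k*k nodes)."""
    nodes = [(x, y) for x in range(k) for y in range(k)]
    result = []
    for (x0, y0) in nodes:
        total = 0
        for (x1, y1) in nodes:
            hops = abs(x0 - x1) + abs(y0 - y1)  # Manhattan distance on the mesh
            total += hops * hop_cycles + bank_cycles
        result.append(total / len(nodes))
    return result

lat = avg_access_latency(4)
# Corner nodes see the worst average latency, center nodes the best --
# precisely the access disparity that parity-oriented partitioning targets.
print(min(lat), max(lat))  # prints: 14.0 16.0
```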
Keywords/Search Tags:Non-Uniform Cache Access, Parity, Last Level Cache, Linear Programming, Multi-core