
Research On Energy Optimization Method For On-chip Cache Memory Subsystem

Posted on: 2018-11-19    Degree: Doctor    Type: Dissertation
Country: China    Candidate: F F Shen    Full Text: PDF
GTID: 1368330512986010    Subject: Computer software and theory
Abstract/Summary:
In recent years, with the rapid development of science and technology, electronic products such as smartphones, computers, wearable devices, smart home appliances, and unmanned aerial vehicles have become widespread. These products bring great convenience to people's lives, but their limited battery endurance has become an increasingly prominent problem. Low-power design is therefore the only viable path toward green intelligent electronic devices.

The processor and the memory are the core components of an intelligent electronic device and usually account for the main part of its power budget. With the advance of semiconductor technology, processors run faster and faster, while main-memory access speed improves relatively slowly, so the performance gap between them keeps widening and the memory-wall problem grows increasingly serious. The emergence of multi-core processor architectures further increases the pressure on data access. On-chip caches can alleviate this speed mismatch to some extent and are therefore widely used in all kinds of computing devices.

Traditional on-chip caches are usually implemented with SRAM, because SRAM offers fast access and a long lifetime. However, as semiconductor feature sizes continue to shrink, the leakage power of SRAM under conventional CMOS technology rises rapidly and gradually becomes dominant. Moreover, for large-capacity on-chip caches, SRAM cells consume a large amount of chip area. SRAM-based on-chip caches are thus no longer able to meet the requirements of low power consumption and high performance.

The emergence of non-volatile memory (NVM) provides a new option for computer storage technology. With its low leakage power, high storage density, and non-volatility, NVM is a promising alternative to traditional storage technologies at different levels of the memory hierarchy, and in recent years many researchers have proposed using NVM for on-chip caches. However, because the manufacturing process and design principles of NVM differ from those of SRAM, NVM technologies share common weaknesses such as high write energy, long write latency, and limited cell lifetime, so traditional cache optimization approaches are not directly applicable. To use NVM as an on-chip cache, we must exploit its advantages while mitigating the cost of its write operations.

In this paper, we approach the problem from the perspective of the storage architecture and use cache partitioning, feedback learning, wear leveling, and data allocation techniques, respectively, to optimize cache energy consumption. Specifically, our work is summarized as follows:

1) Cache energy optimization method based on cache partitioning technology
Cache partitioning is a promising technique for improving cache management. Most existing partitioning schemes consider either performance improvement or energy reduction, and they focus on a specific application domain or a single optimization goal, which limits their general applicability. To address these issues, this paper proposes a novel scheme named Reuse locality aware Cache pArtitioning (ROCA), which aims to keep more high-reuse-locality blocks in the last-level cache. The basic ideas are as follows. First, we partition the cache into a live portion and a dead portion according to cache access behavior. Then, we design a reuse-locality reservation algorithm to maintain these cache fractions. Finally, a reuse-locality-guided block placement policy further improves cache efficiency. Evaluation results show that ROCA is effective for single-threaded, multi-programmed, and multi-threaded workloads with acceptable hardware overhead.
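To make the partitioning idea concrete, the following minimal C sketch shows one possible way a set-associative cache set could be split into a live and a dead portion, with a reuse predictor steering insertions. The way counts, field names, and the prediction interface are assumptions made for illustration only, not the actual ROCA design.

/* Illustrative sketch only: split a cache set into a "live" and a "dead"
 * partition and steer insertions by a reuse predictor. Sizes, names and the
 * prediction rule are assumptions for this example, not the ROCA algorithm. */
#include <stdint.h>
#include <stdbool.h>

#define WAYS       16
#define LIVE_WAYS  12   /* assumed fraction reserved for high-reuse blocks */

typedef struct {
    uint64_t tag;
    bool     valid;
    uint8_t  lru;       /* age counter: smaller = more recently used */
} line_t;

typedef struct {
    line_t way[WAYS];   /* ways [0, LIVE_WAYS) = live partition, rest = dead */
} cache_set_t;

/* Pick the LRU victim (or any invalid way) inside one partition [lo, hi). */
static int victim_in(cache_set_t *s, int lo, int hi)
{
    int v = lo;
    for (int w = lo; w < hi; w++) {
        if (!s->way[w].valid) return w;
        if (s->way[w].lru > s->way[v].lru) v = w;
    }
    return v;
}

/* Insert a block: predicted-reused blocks go to the live partition,
 * the rest to the small dead partition so they are evicted first. */
void roca_insert(cache_set_t *s, uint64_t tag, bool predicted_reuse)
{
    int w = predicted_reuse ? victim_in(s, 0, LIVE_WAYS)
                            : victim_in(s, LIVE_WAYS, WAYS);
    s->way[w] = (line_t){ .tag = tag, .valid = true, .lru = 0 };
    for (int i = 0; i < WAYS; i++)            /* age every other line */
        if (i != w && s->way[i].valid && s->way[i].lru < 255)
            s->way[i].lru++;
}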
2) Non-volatile cache energy optimization method based on feedback learning
A common observation for the last-level cache (LLC) is that a large number of cache blocks are never referenced again before they are evicted. The write operations for these blocks, which we call dead writes, can be eliminated without incurring subsequent cache misses. To address this issue, a quantitative scheme called Feedback learning based dead write termination (FLDWT) is proposed to reduce dead writes in the LLC. The basic ideas are as follows. First, FLDWT dynamically learns block access behavior using data reuse distance and data access frequency, from which we build a quantitative evaluation model of cache blocks. Second, we design a cache block classification algorithm that classifies blocks into live blocks and dead blocks. Finally, FLDWT terminates the write requests of dead blocks and uses feedback information to improve its estimation accuracy. Compared with the most closely related work, experimental results show that our scheme eliminates a large number of dead writes in the LLC and thereby reduces energy consumption.

3) Non-volatile cache energy optimization method based on wear leveling technology
Write operations are unevenly distributed across the cache, and write variation exists both among cache sets (inter-set) and within a cache set (intra-set). These access characteristics lead to uneven wear of the cache storage cells. Unfortunately, wear-leveling approaches designed for NVM-based main memories cannot simply be applied to NVM-based on-chip caches, because main memories exhibit only inter-set variation. Meanwhile, most existing cache management policies are unaware of write variation, so write traffic to the cache cells is unbalanced and heavily written cells fail much earlier than the others. To address this write-endurance problem, this paper proposes a novel technique named SRAM assistEd weAr Leveling (SEAL) for non-volatile caches to guide the allocation of cache blocks. SEAL contains two algorithms: the Write Variation-aware blOck Migration algorithm (WVOM) and the Threshold gUided Block migration algorithm (TUBI). SEAL focuses on cache sets with high write variation and on write-intensive storage cells and attempts to reduce their write pressure. The basic ideas are as follows. First, WVOM improves inter-set wear leveling by detecting the write variation among cache sets and migrating blocks in write-intensive sets to SRAM. Second, TUBI migrates high-write-locality blocks within a cache set to achieve intra-set wear leveling. Compared with the state-of-the-art scheme, experimental results show that SEAL removes a large number of write operations from the NVM and thus achieves both energy reduction and lifetime improvement.
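As a rough illustration of the inter-set step, the C sketch below counts writes per NVM cache set over an epoch and flags sets whose traffic is far above the mean as candidates for redirection to a small SRAM buffer. The epoch handling, counter layout, and threshold are illustrative assumptions, not the WVOM algorithm itself.

/* Illustrative sketch only: detect write-intensive NVM cache sets from
 * per-set write counters and flag them for migration to SRAM, in the spirit
 * of inter-set wear leveling. Parameters are assumptions, not WVOM's. */
#include <stdint.h>
#include <stdbool.h>

#define NUM_SETS        1024
#define MIGRATE_FACTOR  2      /* assumed: >2x the mean write count is "hot" */

static uint32_t set_writes[NUM_SETS];   /* writes per NVM set in this epoch */

void count_write(uint32_t set_idx)
{
    set_writes[set_idx]++;
}

/* At the end of an epoch, mark sets whose write traffic is far above the
 * mean; blocks inserted into a marked set would be redirected to SRAM. */
void mark_hot_sets(bool migrate_to_sram[NUM_SETS])
{
    uint64_t total = 0;
    for (int s = 0; s < NUM_SETS; s++) total += set_writes[s];
    uint32_t mean = (uint32_t)(total / NUM_SETS);

    for (int s = 0; s < NUM_SETS; s++) {
        migrate_to_sram[s] = (mean > 0) && (set_writes[s] > MIGRATE_FACTOR * mean);
        set_writes[s] = 0;              /* reset counters for the next epoch */
    }
}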
4) Hybrid cache energy optimization method based on data allocation technology
A hybrid cache consisting of spin-transfer torque RAM (STT-RAM) and static RAM (SRAM) has recently been proposed as the last-level cache (LLC) for energy efficiency. Existing hybrid-cache optimization methods usually rely on a migration mechanism that moves frequently written cache blocks from the NVM region to the SRAM region, thereby reducing write operations on the NVM. However, migration introduces its own overhead, and a block has already been accessed many times before it is migrated; these energy and performance costs cannot be ignored. To address this problem, this paper proposes a novel statistical behavior guided block placement (SBOP) scheme. The basic ideas are as follows. First, we estimate the characteristics of cache blocks based on the statistical behavior of their read/write re-references. Second, we record these features and design the SBOP architecture. Finally, we design a theoretical analysis model to optimize energy consumption and to guide block placement across the SRAM region and the STT-RAM region. Experimental results demonstrate that cache blocks are placed in the appropriate region in a low-power manner, reducing both dynamic energy consumption and execution time.
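As a simplified illustration of energy-guided placement in a hybrid LLC, the C sketch below compares the expected dynamic energy of a block's predicted read/write re-references in each region and picks the cheaper one. The per-access energy values and the predictor interface are placeholder assumptions, not the SBOP analysis model.

/* Illustrative sketch only: choose between the SRAM and STT-RAM regions of a
 * hybrid LLC by comparing expected dynamic energy of predicted re-references.
 * The energy numbers are placeholders, not values from the dissertation. */
#include <stdint.h>

typedef enum { REGION_SRAM, REGION_STT_RAM } region_t;

/* Assumed per-access dynamic energies in arbitrary units (STT-RAM writes
 * are the expensive case that placement tries to avoid). */
static const double SRAM_READ = 1.0, SRAM_WRITE = 1.0;
static const double STT_READ  = 0.8, STT_WRITE  = 5.0;

/* Place a block where its predicted re-reference mix costs less energy. */
region_t sbop_place(uint32_t pred_reads, uint32_t pred_writes)
{
    double e_sram = pred_reads * SRAM_READ + pred_writes * SRAM_WRITE;
    double e_stt  = pred_reads * STT_READ  + pred_writes * STT_WRITE;
    return (e_stt <= e_sram) ? REGION_STT_RAM : REGION_SRAM;
}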
Keywords/Search Tags:Cache Energy Optimization, On-chip Cache, Non-volatile Memory, Hybrid Cache