Research On Key Technologies Of Memory Architecture For In-memory Computing

Posted on: 2019-03-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Li
Full Text: PDF
GTID: 1368330611993108
Subject: Computer Science and Technology
Abstract/Summary:
Efficient data storage and real-time analysis and processing are urgent needs in the era of big data. Traditional disk-based storage systems cannot respond in time because of their very long access latency. In-memory computing makes full use of large-capacity memory for data processing, reducing or even avoiding I/O requests, and thereby greatly improves big data processing capability. However, because of the "storage wall" and "power wall" problems, DRAM-based memory systems cannot meet the growing demand of big data applications for high-capacity, high-performance, low-energy storage. In addition, because DRAM is volatile, persistent storage of data still relies on external storage, so high-latency I/O requests cannot be completely avoided. DRAM also suffers from high power consumption, low storage density, and the need for refresh operations. Given these deficiencies of traditional storage technology, emerging non-volatile memories offer an important opportunity for in-memory computing. Non-volatile memories generally offer large capacity, low static power consumption, and non-volatility, and have received extensive attention in both academia and industry. The challenges for in-memory computing are how to bridge the performance gap between computing and memory, how to efficiently ensure data consistency in persistent memory, and how to migrate in-memory storage and data processing systems to persistent memory systems while taking full advantage of the features of non-volatile memories. We focus on these challenges and study the key technologies of memory architecture. The main work and innovations are as follows:

Path prefetching mechanism for index accesses in in-memory databases. In-memory databases (IMDBs) keep all working data in memory, so the number and latency of memory accesses become a decisive factor in system performance. By profiling a popular IMDB, we find that index accesses account for the majority of last-level cache misses, and a thorough analysis shows that existing prefetching mechanisms cannot prefetch effectively for index accesses. We also find that key searches are the most common operations in index accesses, and that each search corresponds to a traversal path from the root to the target leaf node. Based on the key observation that adjacent keys in an ordered index follow similar traversal paths, we propose path prefetching, which records the mapping between index keys and traversal paths. When the same or similar index keys are searched again, the recorded mapping information is used to generate accurate prefetch requests. In this work, the path prefetcher is implemented in the cache controller. The concept of a partial hit is proposed to support prefetching for searches of adjacent keys, and a mechanism is proposed for updating the mapping information to reflect changes in the ordered index structure. The experimental results show that, for ordered index lookups, path prefetching has a small storage overhead and improves performance by 27.4%, and that it scales better than traditional prefetching mechanisms on realistic large-scale workloads.

Persistent memory with an efficient crash consistency mechanism. Like traditional disk-based storage systems, persistent memory must guarantee crash consistency: when the system goes down, the stored data must remain consistent, and after the system is restored it must still be able to provide data access services. However, traditional consistency mechanisms rely on ordering constraints among persistence operations, which are enforced with instructions such as cache flushes and memory fences. These instructions eliminate the opportunity for the memory system to merge and reorder memory requests, and they increase the write traffic to persistent memory. Memory persistency theory proposes the strand persistence model, which greatly relaxes the ordering constraints of persistence operations, but no concrete system implementation exists. This work first extends the strand persistence model. By ensuring the atomicity of each strand, the ordering constraints among the persistence operations inside a strand are further eliminated, and a persistent memory system with an efficient data consistency guarantee is designed on top of the extended model. Because the atomicity of each strand is tracked and guaranteed, in the event of a system failure the strands committed in the strand buffer can be written back to persistent memory using a supercapacitor power supply. Experiments show a performance improvement of 6.6% over the state-of-the-art persistent memory system, and performance exceeds that of a system without a consistency guarantee. In addition, because of the merging effect of the strand buffer, the write traffic to persistent memory is reduced by 30%.

Non-volatile memory system that supports row and column access. There is a large body of research on non-volatile memories, most of which focuses on applications of non-volatility or on problems such as read/write asymmetry, long write latency, and limited lifetime, but ignores the symmetry of the crossbar structure adopted by most non-volatile memories. Based on the symmetric crossbar array structure, we propose a novel non-volatile memory architecture called RC-NVM that supports both row and column access, in order to accelerate applications with mixed row and column accesses. This work introduces the RC-NVM architecture design, proposes a dedicated request scheduler design, and designs the cache architecture to solve the cache synonym and cache consistency problems. This work also addresses the deployment of in-memory databases on the RC-NVM system and proposes a group cache optimization technique for wide-field and multi-field accesses. The experimental results show that the memory access performance of IMDBs is improved by 14.5 times with an additional area overhead of 10%. For general-purpose matrix multiplication, RC-NVM naturally supports SIMD operations and is 19% better than the best tiled method.
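
The path prefetching idea summarized in the abstract (record a mapping from index keys to traversal paths, then reuse it for the same or adjacent keys) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the dissertation's hardware design: the three-level `traverse` index, the `PathPrefetcher` class, the `max_key_distance` threshold, and the reading of a "partial hit" as "a recorded key lies within a small distance of the searched key" are all hypothetical choices made for illustration.

```python
from bisect import bisect_left, insort


def traverse(key):
    """Toy 3-level ordered index over integer keys (hypothetical layout).

    Returns the node ids visited from the root to the leaf holding `key`.
    """
    return [("root", 0), ("inner", key // 100), ("leaf", key // 10)]


class PathPrefetcher:
    """Software model of a key -> traversal-path table (illustrative only)."""

    def __init__(self, max_key_distance=4):
        self.keys = []    # recorded keys, kept sorted for neighbour lookup
        self.paths = {}   # key -> list of node ids on its traversal path
        self.max_key_distance = max_key_distance  # assumed partial-hit window

    def record(self, key, path):
        """Remember the path that a completed search for `key` followed."""
        if key not in self.paths:
            insort(self.keys, key)
        self.paths[key] = list(path)

    def predict(self, key):
        """Return (kind, nodes_to_prefetch) for an upcoming search."""
        if key in self.paths:
            return "hit", self.paths[key]
        # Assumed partial hit: reuse the path of the nearest recorded key,
        # since adjacent keys in an ordered index follow similar paths.
        i = bisect_left(self.keys, key)
        neighbours = self.keys[max(i - 1, 0):i + 1]
        if neighbours:
            nearest = min(neighbours, key=lambda k: abs(k - key))
            if abs(nearest - key) <= self.max_key_distance:
                return "partial", self.paths[nearest]
        return "miss", []

    def invalidate(self, node):
        """Drop recorded paths through `node`, e.g. after a node split."""
        stale = [k for k, p in self.paths.items() if node in p]
        for k in stale:
            del self.paths[k]
            self.keys.remove(k)


prefetcher = PathPrefetcher()
prefetcher.record(42, traverse(42))
kind, nodes = prefetcher.predict(43)  # adjacent key reuses the path of 42
```

In this toy index, keys 42 and 43 fall in the same leaf, so the path recorded for 42 is exactly the path a search for 43 will follow. The `invalidate` method stands in for the abstract's mechanism for keeping the mapping in step with structural changes to the index; how the actual design does this is not described in the abstract.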
Keywords/Search Tags: In-memory Computing, In-memory Database, Prefetching, Non-volatile Memory, Crash Consistency, Crossbar