With the rapid increase of data volume, main memory systems with high performance, large capacity, and low power consumption are required to support big data processing. Emerging non-volatile memory (NVM) is the most promising candidate for next-generation main memory due to its advantages, including DRAM-like performance, high storage density, low static power consumption, and non-volatility. NVM-based main memory is called persistent memory. Persistent memory faces the problem of recovering from data corruption caused by incomplete updates after system crashes, i.e., the crash consistency problem. To solve this problem, current studies migrate crash consistency guarantee methods, such as logging and checkpointing, from external storage systems to persistent memory systems. However, existing methods do not fully consider the memory access patterns of applications or the characteristics of persistent memory, resulting in large performance overhead. This overhead is unacceptable because, given the low access latency of persistent memory, system performance is sensitive to the cost of guaranteeing crash consistency. To address these issues, this thesis designs methods that efficiently ensure crash consistency by exploiting the memory access patterns of applications and the characteristics of persistent memory, providing support for building persistent memory systems with high performance, large capacity, and low power consumption.

To address the significant performance degradation caused by using logging to ensure crash consistency in write-intensive scenarios, a log reduction method based on the memory access patterns of applications (CCHL) is proposed. This method reduces log writes by compressing and consolidating log entries, thus improving the performance of the persistent memory system. Guided by the memory access patterns of applications, CCHL analyzes how transactions within applications update data. The first observation is that the values of most data updated by transactions do not actually change; such data are called clean data. Based on this observation, an intra-transaction update-pattern-based log compression mechanism (LCOMP) is designed. LCOMP compresses log entries by discarding clean data, which reduces log writes. To track the relationship between original data and compressed data, a modified flag is added to each log entry; the flag size is adjusted according to application demands, which limits the overhead introduced by the additional flags. The second observation is that a large number of adjacent transactions update the same data. Based on this observation, an inter-transaction update-pattern-based log consolidation mechanism (LCONS) is designed. LCONS reduces the number of log entries written to persistent memory by consolidating the log entries of adjacent transactions that update the same data. The experimental results show that, compared with the existing hardware redo logging design, CCHL improves transaction throughput by 47.8%, reduces persistent memory write traffic by 35.4%, and reduces memory system energy consumption by 21.0%.
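The following minimal C sketch illustrates the two CCHL mechanisms under simplifying assumptions: a log entry covers one 64-byte cache line split into eight 8-byte words, and the modified flag is a per-word bitmap. All names, types, and sizes here are illustrative stand-ins, not the thesis's actual hardware structures.

    #include <stdint.h>

    #define WORDS_PER_LINE 8            /* one 64-byte line as eight 8-byte words */

    /* Staged log entry: full line data plus a "modified" bitmap flag. */
    typedef struct {
        uint64_t addr;                  /* address of the logged cache line   */
        uint8_t  modified;              /* bit i set => word i really changed */
        uint64_t data[WORDS_PER_LINE];
    } staged_entry_t;

    /* LCOMP: record only the words whose values actually changed; clean
     * words are discarded instead of being logged. */
    void lcomp_stage(staged_entry_t *e, uint64_t addr,
                     const uint64_t *old_line, const uint64_t *new_line)
    {
        e->addr = addr;
        e->modified = 0;
        for (int i = 0; i < WORDS_PER_LINE; i++) {
            if (old_line[i] != new_line[i]) {
                e->modified |= (uint8_t)(1u << i);
                e->data[i] = new_line[i];
            }
        }
    }

    /* LCONS: an adjacent transaction updating the same line is merged into
     * the staged entry (newer redo values win), so a single consolidated
     * entry reaches persistent memory instead of two. */
    void lcons_merge(staged_entry_t *e, const staged_entry_t *newer)
    {
        for (int i = 0; i < WORDS_PER_LINE; i++) {
            if (newer->modified & (1u << i)) {
                e->modified |= (uint8_t)(1u << i);
                e->data[i] = newer->data[i];
            }
        }
    }

    /* Flush: write only the address, the bitmap, and the dirty words. */
    int flush_entry(const staged_entry_t *e, uint64_t *out)
    {
        int n = 0;
        for (int i = 0; i < WORDS_PER_LINE; i++)
            if (e->modified & (1u << i))
                out[n++] = e->data[i];
        return n;                       /* words written: n instead of 8 */
    }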
To address the large performance overhead caused by writing redundant data into logs in persistent memory systems, a log morphing method based on the necessary conditions of recovery (MorLog) is proposed. This method avoids writing redundant data by dynamically changing the types and the coding strategies of log entries, thus reducing the performance overhead of logging. Based on the characteristics of persistent memory, MorLog analyzes which log data are actually used to recover in-memory data after system crashes. The first observation is that only the oldest and the newest data values within log entries are used during recovery; the intermediate values are not. Based on this observation, a log-type morphing mechanism (MLOG) is designed. MLOG avoids writing intermediate values into logs by dynamically changing entry types when creating log entries. The second observation is that, within log entries, the data with bit flips are used during recovery, while the data without bit flips are not. Based on this observation, a log-coding morphing mechanism (SLDE) is designed. SLDE reduces write costs by dynamically changing coding strategies when writing log entries into persistent memory; it provides a differential log data compression strategy that removes the data without bit flips from logs by compressing log entries according to the bit-flip patterns of the data. The experimental results show that, compared with the combination of the existing hardware undo+redo logging design and the existing data coding design, MorLog improves transaction throughput by 107.3%, reduces persistent memory write traffic by 63.4%, and reduces write energy by 56.4%. MorLog can also work together with CCHL: compared with MorLog alone, MorLog+CCHL improves transaction throughput by up to 13.1% and reduces write traffic by up to 24.7%.

To address the substantial overhead of ensuring crash consistency with software-transparent checkpointing, a fast software-transparent checkpointing method based on operation decoupling (NICO) is proposed. This method relies on operation decoupling to mitigate the performance degradation caused by the persist operations used to create checkpoints, thus reducing checkpointing overhead. NICO analyzes existing checkpointing methods and observes two key factors that determine checkpointing overhead: the interference of persist operations with memory access operations, and the amount of data that must be persisted when a checkpoint is created. To reduce the amount of data, persist operations are decoupled from checkpoint creation and executed in parallel with memory access operations before checkpoint creation starts. To reduce the interference, the two types of operations are decoupled from each other: persist operations use an execution path that bypasses the caches, while memory access operations use the path that goes through the caches, which mitigates resource contention. The experimental results show that, under array-access workloads, NICO improves instructions per cycle (IPC) by 196.9% and reduces the time spent on creating checkpoints by 99.5% compared with the existing software-transparent checkpointing design. Under transaction-intensive workloads, NICO improves transaction throughput by 68.9% and reduces checkpoint creation time by 99.1%. Under standard workloads, NICO improves IPC by 155.3%.
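The differential coding idea of SLDE described above admits a simple software analogue: XORing a word's new value with its old value turns words without bit flips into zeros, which can then be dropped from the log. The encoding below is hypothetical; the actual SLDE is a hardware coding scheme.

    #include <stdint.h>

    #define WORDS_PER_LINE 8

    /* Encode: keep only the nonzero XOR deltas (words that had bit flips),
     * plus a small bitmap recording which words survive. */
    int slde_encode(const uint64_t *old_line, const uint64_t *new_line,
                    uint64_t *delta_out, uint8_t *bitmap)
    {
        int n = 0;
        *bitmap = 0;
        for (int i = 0; i < WORDS_PER_LINE; i++) {
            uint64_t d = old_line[i] ^ new_line[i];   /* the bit flips */
            if (d != 0) {
                *bitmap |= (uint8_t)(1u << i);
                delta_out[n++] = d;
            }
        }
        return n;   /* words without bit flips are never written */
    }

    /* Decode during recovery: re-apply the recorded flips to the old data. */
    void slde_decode(uint64_t *line, const uint64_t *delta, uint8_t bitmap)
    {
        int n = 0;
        for (int i = 0; i < WORDS_PER_LINE; i++)
            if (bitmap & (1u << i))
                line[i] ^= delta[n++];
    }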
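NICO itself is an architectural design, but its two decoupling ideas can be sketched at user level, assuming x86 non-temporal stores stand in for the cache-bypassing persist path and a shadow buffer with a dirty map stands in for the hardware's tracking of unpersisted data (both are hypothetical simplifications).

    #include <stddef.h>
    #include <stdint.h>
    #include <emmintrin.h>   /* _mm_stream_si64, _mm_sfence */

    /* Persist path: a non-temporal store bypasses the caches, so persist
     * traffic does not contend with the application's cached accesses. */
    static void persist_word(long long *pm_dst, long long v)
    {
        _mm_stream_si64(pm_dst, v);
    }

    /* Decoupled from checkpoint creation: dirty words are persisted in the
     * background, in parallel with ordinary memory accesses, so almost
     * nothing is left to persist when the checkpoint is actually created. */
    static void background_persist(long long *pm, const long long *shadow,
                                   const uint8_t *dirty, size_t nwords)
    {
        for (size_t i = 0; i < nwords; i++)
            if (dirty[i])
                persist_word(&pm[i], shadow[i]);
    }

    /* Checkpoint creation then only drains residual stores and publishes
     * a new epoch marker (ordering simplified for the sketch). */
    static void create_checkpoint(uint64_t *pm_epoch)
    {
        _mm_sfence();                              /* drain prior persists */
        _mm_stream_si64((long long *)pm_epoch,
                        (long long)(*pm_epoch + 1));
        _mm_sfence();                              /* order the marker     */
    }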