| Key-value databases are widely used in data-intensive applications such as e-commerce and social networks because of their high performance, scalability, and flexibility. The LSM-tree (Log-Structured Merge-tree) is the most commonly used storage architecture for key-value stores; it offers high write performance but suffers significant write amplification caused by compaction. NAND flash-based solid-state drives (SSDs) are widely used in modern computer systems because of their low power consumption, high vibration resistance, and fast random access. However, directly migrating LSM-trees from HDDs to SSDs cannot fully exploit the parallelism of SSDs, and cascading write amplification can adversely affect their lifespan. Key-value separation largely alleviates write amplification, but garbage collection in the value store (VStore) can significantly degrade normal read and write performance. Moreover, a range query is transformed into a large number of random reads on the VStore, so range queries perform poorly. To address the excessive garbage collection overhead and poor range query performance of existing key-value separation systems, a new system architecture is proposed that combines the existing compaction mechanism, VStore structure, and garbage collection strategy. (1) The Collaborative Compaction and Handle Delay Update mechanism appends DropKeys to the end of the corresponding VStore file during compaction, so that garbage collection can check the validity of values without traversing the LSM-tree. Value handle information is updated only when data is queried, which minimizes the impact on normal user reads and writes and on write amplification. (2) The VStore is partitioned by key range, and a combination of VTable and SVTable is designed to store key-value pairs, achieving sequential reads and writes on the VStore and improving range query performance by fully utilizing the internal parallelism of SSDs. To validate the design, a key-value separation storage engine, RWSKV, is implemented on top of BadgerDB. Its performance is evaluated with the YCSB benchmark and compared against golevelDB, HashKV, and BadgerDB. The experimental results show that, compared with traditional key-value separation systems, RWSKV improves random read and write throughput by about 15% and range query performance by about 25%, and reduces the impact of garbage collection on normal read and write performance by about 20%. |
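
The DropKeys idea in (1) can be illustrated with a small in-memory model. The sketch below is not the RWSKV implementation: the names `VFile`, `ToyStore`, `seal`, `collect`, and the `forward` table are hypothetical, a plain dictionary stands in for the LSM-tree, and stale keys are recorded eagerly on overwrite rather than during a real compaction. It only shows, under those assumptions, how garbage collection might decide validity from a file's own DropKeys and how a value handle might be refreshed lazily on the read path.

```python
from dataclasses import dataclass, field


@dataclass
class VFile:
    """One VStore file in this toy model (the layout is an assumption)."""
    records: dict = field(default_factory=dict)  # offset -> (key, value)
    drop_keys: set = field(default_factory=set)  # (key, offset) footer entries


class ToyStore:
    def __init__(self):
        self.index = {}        # stands in for the LSM-tree: key -> (file_id, offset)
        self.files = {0: VFile()}
        self.active = 0        # file currently receiving writes
        self.next_file_id = 1
        self.next_offset = 0   # offsets are globally unique to keep the toy simple
        self.forward = {}      # handle before GC -> handle after GC

    def _resolve(self, handle):
        # Follow GC forwarding entries until reaching a live location.
        while handle in self.forward:
            handle = self.forward[handle]
        return handle

    def put(self, key, value):
        old = self.index.get(key)
        offset = self.next_offset
        self.next_offset += 1
        self.files[self.active].records[offset] = (key, value)
        self.index[key] = (self.active, offset)
        if old is not None:
            # A real engine would do this when compaction drops the old
            # version; the toy does it eagerly on overwrite.
            file_id, old_offset = self._resolve(old)
            self.files[file_id].drop_keys.add((key, old_offset))

    def seal(self):
        # Start a new active file so the previous one becomes eligible for GC.
        self.active = self.next_file_id
        self.files[self.active] = VFile()
        self.next_file_id += 1

    def collect(self, file_id):
        # Validity comes from the file's own DropKeys; the index is never read.
        assert file_id != self.active, "cannot collect the active file"
        victim = self.files.pop(file_id)
        target_id = self.next_file_id
        self.next_file_id += 1
        target = VFile()
        for offset, (key, value) in victim.records.items():
            if (key, offset) in victim.drop_keys:
                continue  # stale version, reclaim its space
            target.records[offset] = (key, value)
            self.forward[(file_id, offset)] = (target_id, offset)
        self.files[target_id] = target
        return len(target.records)  # number of live records carried over

    def get(self, key):
        handle = self.index.get(key)
        if handle is None:
            return None
        live = self._resolve(handle)
        if live != handle:
            self.index[key] = live  # delayed handle update, done on the read path
        file_id, offset = live
        return self.files[file_id].records[offset][1]
```

In this model, writing `a`, `b`, and then `a` again, sealing the file, and collecting it carries over two live records while leaving the index entry for `a` untouched; the entry is rewritten only when `a` is next read. The forwarding table is never pruned here, which a real system would need to handle.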