Configuration Optimization Of RocksDB Storage Engine Based On Machine Learning

Posted on: 2020-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: K Y Luo
GTID: 2428330575952506
Subject: Computer Science and Technology
Abstract:
With the rapid development of large-scale distributed storage technology, research on transforming traditional relational databases has become a hotspot, and many new database systems that use RocksDB as their storage engine have appeared. RocksDB is a key-value storage system based on the log-structured merge tree (LSM-tree). It converts random I/O into sequential I/O, which makes it a preferred choice for large-scale data storage. However, RocksDB also has shortcomings: its read performance is limited, and its storage parameters cannot adapt to workload changes. In extreme environments, RocksDB's default parameter configuration can lead to thread blocking or even write stops because the queue of background merge operations grows long, resulting in significant performance degradation. In terms of read performance, a read operation in RocksDB may have to visit multiple levels, and this cost cannot be fully mitigated by adding indexes and Bloom filters alone, so some performance loss is inevitable. Therefore, although RocksDB has significant write advantages, these problems restrict its further application.

To solve these problems, this thesis aims to build a workload-aware storage engine and uses machine learning as the core method to study storage parameter configuration and active caching of hot data in RocksDB. The main contributions of this thesis are as follows:

1) To improve RocksDB's awareness of general workloads and adjust storage parameters automatically, an intelligent parameter tuning module based on reinforcement learning is designed and implemented. The module captures the differences between workloads and their dynamic changes, and builds an environment model with a model-based reinforcement learning framework. The model captures the complex relationship between workload and storage parameters and reduces the loss of system performance caused by workload changes. The experimental results show that the parameter tuning system outperforms the default parameters when the workload changes significantly, and that overall performance in mixed read-write scenarios improves by about 8%.

2) To improve the performance of read-oriented workloads and fit the distribution of hot data, this thesis designs and implements an active caching module for hot data based on incremental learning with predictive analysis. The module models the data according to its access pattern, predicts potential hot data, and actively caches it in memory, so that hot data always stays in the lower levels of the log-structured merge tree, which reduces the number of I/O operations on the storage medium. The experimental results show that the active caching model performs well when the read workload has a pronounced access pattern.
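The abstract does not say which model the active caching module uses, so the following is only a minimal sketch of the general idea (update a per-key hotness estimate incrementally, predict the hot set, and load it into memory ahead of reads). It assumes an exponentially decayed access count as the predictor, which is a stand-in and not the thesis's method. The names `HotKeyPredictor`, `observe_window`, `predict_hot`, and `refresh_cache` are hypothetical, and a plain dictionary stands in for the on-disk store.

```python
from collections import defaultdict


class HotKeyPredictor:
    """Hypothetical predictor: exponentially decayed access counts per key.

    This is an illustrative stand-in, not the model used in the thesis.
    """

    def __init__(self, decay=0.5, capacity=2):
        self.decay = decay          # weight kept from older windows (0..1)
        self.capacity = capacity    # number of keys the cache may hold
        self.scores = defaultdict(float)

    def observe_window(self, accessed_keys):
        """Incremental update: age old scores, then add the new accesses."""
        for key in list(self.scores):
            self.scores[key] *= self.decay
        for key in accessed_keys:
            self.scores[key] += 1.0

    def predict_hot(self):
        """Return the keys expected to be hot, highest score first.

        Ties are broken by key so the result is deterministic.
        """
        ranked = sorted(self.scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [key for key, _ in ranked[: self.capacity]]


def refresh_cache(predictor, store):
    """Proactively load the predicted hot keys from `store` into memory.

    `store` is a plain dict standing in for the on-disk key-value store.
    """
    return {key: store[key] for key in predictor.predict_hot() if key in store}


if __name__ == "__main__":
    store = {"a": 1, "b": 2, "c": 3}
    predictor = HotKeyPredictor(decay=0.5, capacity=2)
    predictor.observe_window(["a", "a", "b"])        # a=2.0, b=1.0
    predictor.observe_window(["c", "c", "c", "b"])   # a=1.0, b=1.5, c=3.0
    print(refresh_cache(predictor, store))
```

The decay factor controls how quickly the predictor forgets old access patterns: a value near 1 favors long-term popularity, while a value near 0 tracks only the most recent window.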
Keywords: RocksDB, Reinforcement Learning, Intelligent Tuning, Active Caching, Incremental Learning