| With the rapid development of Internet technology, applications place increasingly high demands on the stability and data processing performance of key-value database storage systems. Traditional key-value database storage systems, which are built from static data structures and heuristic algorithms, often rely on manual tuning, such as index optimization and buffer parameter tuning, to maintain stable performance. Although manual tuning can effectively improve system performance, several problems remain. First, in the era of big data, both the amount of data and the number of database instances keep growing, which sharply increases the cost of manual tuning. Second, workloads are becoming more complex and diverse, and humans cannot tune for them in a timely and effective manner. Third, traditional key-value database storage systems struggle to meet the needs of big data because they cannot solve the hot-spot problem (that is, under some highly skewed workloads, only a small part of the system's data is accessed frequently). To address these problems, this thesis carries out work in the following two areas. (1) Existing key-value database storage systems lack hot-spot awareness, which leads to poor performance and unreliability under highly skewed workloads. This thesis proposes an adaptive hot-spot-aware hash index model, which implements a high-performance hash table based on summary information of the keys. First, the summary information of each key was stored in place of the key itself to compress key storage, and the bucket data structure of the hash table was optimized; together these changes effectively increased the available memory of the hash table, avoided most hash collisions at the source, and improved access efficiency. Second, CPU data-level parallelism and the CPU cache line were 
used to optimize the hash table probe operation, avoiding the performance degradation caused by repeated probing when open addressing is used to resolve hash collisions. Finally, to address the problem that summary information cannot be used to compare keys exactly, so that additional disk I/O is required, an adaptive key scheduling algorithm was designed, which dynamically adjusts where keys are stored according to the currently available memory, the hash index load, and the access hot spots. Experiments on the YCSB simulation dataset show that, compared with the state-of-the-art hash table, the adaptive hot-spot-aware hash index improves speed by 1.2 times under the same memory usage. (2) Existing key-value database storage systems manage the buffer through manually configured parameters, which leaves the buffer poorly adaptive and makes it a performance bottleneck. This thesis proposes an adaptive buffer management method. First, the read-ahead algorithm and the hot and cold cache replacement algorithm were studied in depth, and the specific effects of the read-ahead threshold and the hot and cold ratio on the two algorithms were clarified. Second, a parameter evaluation process was designed that uses a FIFO history queue and auxiliary fields to evaluate in real time whether the current parameter is too large or too small. Finally, a buffer adaptive model was designed, which uses the native performance metrics of the key-value database storage system to adjust the parameters appropriately. A total of 900 groups of simulation experiments were carried out on the FIU dataset. The experimental results show that, compared with the system's baseline read-ahead algorithm and hot and cold cache algorithm, the two adaptive algorithms reduce disk I/O by 8% and increase the cache hit rate by 24% without sacrificing running speed. Compared with the latest cache replacement algorithm, the adaptive hot and cold cache replacement algorithm 
reaches 1.6 times the speed while maintaining the cache hit rate. |
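
The summary-based index described in (1) can be illustrated with a minimal sketch. It is not the thesis implementation: the class name `FingerprintTable`, the 8-bit summary width, the BLAKE2 hash, and the Python list standing in for on-disk records are all assumptions made for illustration, and the probe loop is scalar rather than SIMD-accelerated. The sketch shows why storing only a summary saves memory and also why an exact key comparison costs an extra record read, which is the overhead the adaptive key scheduling algorithm is meant to reduce.

```python
import hashlib


class FingerprintTable:
    """Open-addressing table whose slots hold (fingerprint, record_id) only."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.slots = [None] * capacity  # in-memory index: summaries, not full keys
        self.records = []               # stand-in for on-disk (key, value) records
        self.full_key_reads = 0         # counts simulated disk reads

    @staticmethod
    def _hash(key):
        digest = hashlib.blake2b(key.encode(), digest_size=8).digest()
        return int.from_bytes(digest, "little")

    def _home_and_fp(self, key):
        h = self._hash(key)
        # Low bits choose the home slot; the top 8 bits form the key summary.
        return h % self.capacity, (h >> 56) & 0xFF

    def _read_record(self, record_id):
        self.full_key_reads += 1  # an exact comparison needs the full key
        return self.records[record_id]

    def put(self, key, value):
        home, fp = self._home_and_fp(key)
        for step in range(self.capacity):  # linear probing
            i = (home + step) % self.capacity
            slot = self.slots[i]
            if slot is None:
                self.records.append((key, value))
                self.slots[i] = (fp, len(self.records) - 1)
                return
            if slot[0] == fp and self._read_record(slot[1])[0] == key:
                self.records[slot[1]] = (key, value)  # overwrite existing key
                return
        raise MemoryError("table full")

    def get(self, key):
        home, fp = self._home_and_fp(key)
        for step in range(self.capacity):
            i = (home + step) % self.capacity
            slot = self.slots[i]
            if slot is None:
                return None  # an empty slot ends the probe sequence
            if slot[0] == fp:  # summary match: confirm against the full key
                stored_key, value = self._read_record(slot[1])
                if stored_key == key:
                    return value
        return None
```

In this sketch every successful lookup pays at least one simulated record read, and a summary collision pays more. Keeping the full keys of frequently accessed entries in memory, as the abstract describes, would remove that read for hot keys.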