
The Research And Implementation Of Storage Mechanism For Hot And Cold Data

Posted on: 2022-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: J X Xu
Full Text: PDF
GTID: 2518306524480504
Subject: Computer Science and Technology
Abstract/Summary:
The proliferation of data has increased storage requirements, and enterprises are optimizing storage space by building tiered storage architectures. To improve space utilization in a tiered storage system, enterprises divide data into hot data and cold data according to access frequency, and match data access characteristics to storage device performance to avoid the storage pollution caused by cold data residing on high-performance storage media. The accuracy with which hot and cold data are identified directly affects the data access efficiency of a tiered storage system. It is therefore of great significance to study the storage mechanism for hot and cold data.

The study of hot and cold data originated from the hierarchical design of computer caches. Most classical cache replacement algorithms decide whether data should be swapped out based on a single characteristic of the data, such as access time or access frequency. These algorithms consider only part of the characteristics of data access and cannot adapt well to changes in data access patterns. This paper presents an in-depth analysis of the shortcomings of traditional cache replacement strategies, proposes a method for determining hot and cold data based on multidimensional characteristics of the data, and further designs a storage strategy for hot and cold data based on this method.

To address the limitations of the traditional strategies, this paper quantifies the temperature of data from three characteristics: access time, access frequency and data relevance, and estimates the future access pattern of the data from this temperature in order to determine whether the data is hot or cold. To address the problem that data storage cannot adapt to changes in data access patterns, this paper proposes a data migration strategy based on data temperature, which dynamically adjusts the storage threshold of the hot database according to the distribution of data temperature in the database, and migrates data exceeding the threshold to the cold database to complete the separation of hot and cold data storage.

Finally, based on the Redis and HBase databases, this paper implements a prototype of the designed hot and cold data storage strategy, and completes a precision test of the hot and cold data determination strategy as well as functional and performance tests of the storage strategy. The results verify that the proposed strategy identifies hot and cold data more accurately than the LFU and LRU algorithms, and that the Redis database remains available for reads and writes after the hot and cold data migration strategy is embedded. In addition, the read-write performance of the Redis database is improved under various data access modes, reaching the expected goal.
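
The abstract does not give the temperature formula or the threshold rule, so the following is only a minimal sketch of the general idea, not the thesis's method. Everything in it is an assumption made for illustration: the `Record` fields, the exponential recency decay with a `half_life`, the logarithmic frequency term, the weights `w_time`, `w_freq` and `w_rel`, treating "data relevance" as the mean score of related keys, and deriving the threshold as a rank cut-off (`hot_fraction`) over the current temperature distribution. It also assumes that records whose temperature falls below the threshold are the ones moved to cold storage.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    key: str
    last_access: float   # timestamp of the most recent access, in seconds
    access_count: int    # number of accesses observed so far
    related: list = field(default_factory=list)  # hypothetical: keys accessed together


def base_temperature(rec, now, half_life=3600.0, w_time=0.5, w_freq=0.5):
    """Combine recency and frequency into one score (assumed formula)."""
    # Recency halves every `half_life` seconds; it is 1.0 right after an access.
    recency = 0.5 ** ((now - rec.last_access) / half_life)
    # Frequency grows with the access count but saturates below 1.0.
    frequency = 1.0 - 1.0 / (1.0 + math.log1p(rec.access_count))
    return w_time * recency + w_freq * frequency


def temperature(rec, records, now, w_rel=0.2):
    """Blend a record's own score with the mean score of its related records."""
    own = base_temperature(rec, now)
    neighbours = [records[k] for k in rec.related if k in records]
    if not neighbours:
        return own
    rel = sum(base_temperature(n, now) for n in neighbours) / len(neighbours)
    return (1.0 - w_rel) * own + w_rel * rel


def split_hot_cold(records, now, hot_fraction=0.5):
    """Derive a threshold from the temperature distribution and partition keys."""
    temps = {k: temperature(r, records, now) for k, r in records.items()}
    ranked = sorted(temps.values(), reverse=True)
    keep = max(1, int(len(ranked) * hot_fraction))
    threshold = ranked[keep - 1]  # temperature of the coldest record kept hot
    hot = {k for k, t in temps.items() if t >= threshold}
    cold = set(temps) - hot       # candidates for migration to the cold store
    return threshold, hot, cold
```

In a prototype like the one described, the `hot` set would stay in Redis and the `cold` set would be written to HBase. The sketch only decides which keys belong where and makes no database calls. Because the threshold is recomputed from the current scores on each call, it moves as the access pattern changes.
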
Keywords/Search Tags: Hot and cold data, cache replacement, data migration, tiered storage