
Research on Key Technologies of a Ceph-Based Distributed Cold Storage System

Posted on: 2024-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Zhang
Full Text: PDF
GTID: 2558306914464704
Subject: Information and Communication Engineering
Abstract/Summary:
With the widespread use of smart digital devices and the growing digitization of society, data volumes have grown rapidly, driving research on efficient and secure large-scale data storage and management. As data volumes expand, however, storage hardware and operation and maintenance costs rise with them, and reducing these costs has become a challenge in the storage field. This thesis implements a distributed cold storage system based on Ceph and designs its critical component, the metadata key-value database.

The system classifies data as cold or hot based on last access time. Cold data is accessed infrequently, so some performance can be sacrificed in exchange for lower storage cost. Hot data is accessed frequently and requires maximum performance to serve high-throughput, low-latency applications. The main idea of the system is to power individual hard drives on or off as needed. To support this, low-cost customized storage servers are used, and a hard drive power control module is designed on the storage server to manage this process. Building on the Ceph architecture, the system also introduces the concepts of virtual pools, work pools, and cold pools, and implements data redirection and cross-pool file storage so that cold data is stored centrally, which makes hard drive on/off strategies easier to apply.

However, the Ceph-based cold storage solution increases cluster metadata by two orders of magnitude. To address this, the thesis designs a new key-value database, SkipDB. It first optimizes LSM-Tree, the core data structure of the key-value database, by designing sTree, a partially ordered data organization method based on key-value separation, together with its data merging method, which effectively reduces the read and write amplification of LSM-Tree. The basic infrastructure of SkipDB is then designed and implemented. SkipDB adopts a fine-grained key-value separation strategy that manages unordered, partially ordered, and fully ordered key-value pairs of different sizes differently, balancing the performance of SkipDB’s read, write, and scan operations.
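
The hot/cold split by last access time described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the 30-day threshold, the `ObjectInfo` type, the function names, and the direct mapping from class to pool are all assumptions made for illustration. Only the terms "work pool" and "cold pool" come from the abstract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: the abstract gives no threshold; 30 days is purely illustrative.
COLD_AFTER = timedelta(days=30)


@dataclass
class ObjectInfo:
    """Hypothetical per-object record holding the last access time."""
    name: str
    last_access: datetime


def classify(obj: ObjectInfo, now: datetime,
             cold_after: timedelta = COLD_AFTER) -> str:
    """Return 'cold' if the object was last accessed at least
    `cold_after` ago, otherwise 'hot'."""
    return "cold" if now - obj.last_access >= cold_after else "hot"


def target_pool(obj: ObjectInfo, now: datetime,
                cold_after: timedelta = COLD_AFTER) -> str:
    """Map the class to a pool name. The pool terms follow the abstract;
    the one-to-one mapping is a simplification."""
    if classify(obj, now, cold_after) == "cold":
        return "cold_pool"
    return "work_pool"


if __name__ == "__main__":
    now = datetime(2024, 6, 13)
    archive = ObjectInfo("archive.tar", now - timedelta(days=90))
    report = ObjectInfo("report.csv", now - timedelta(hours=1))
    for obj in (archive, report):
        print(obj.name, classify(obj, now), target_pool(obj, now))
```

Grouping cold objects into one pool in this way is what would let the drives behind that pool stay powered off most of the time, which is the cost-saving idea the abstract describes.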
Keywords/Search Tags: Distributed Storage, Cold Data Storage, Key-Value Database, Log-Structured Merge-Tree