A New Layered Optimization Technology For Distributed Storage

Posted on: 2022-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: C J Deng
Full Text: PDF
GTID: 2518306722972229
Subject: Software engineering
Abstract/Summary:
Compared with centralized storage, distributed storage systems offer horizontal scalability, high availability, and high performance, and can cope with rapidly growing volumes of data. A newly built distributed storage system generally contains a variety of storage media: slow, large-capacity, inexpensive mechanical hard disks (HDD); faster, smaller-capacity, more expensive SATA SSDs; and very fast, smaller-capacity, very expensive NVMe SSDs. A distributed storage system therefore brings complexity: its scale is larger, its structure is more complicated, and it often requires specialized operation and maintenance personnel. Software-defined storage (SDS) is a system architecture that separates storage software from hardware, removes the software's dependence on proprietary hardware, abstracts the distributed storage software, and can adapt to various hardware or hardware combinations. The consequence is that the same distributed storage software may run in very different hardware environments and at very different scales. It is difficult to find a universally applicable operation and maintenance strategy for complex, heterogeneous, large-scale storage clusters, so different clusters need targeted parameter tuning and performance optimization.

This thesis focuses on performance optimization of a tiered storage system based on the storage engine. Tiered storage scheduling is studied on the basis of the ant colony algorithm, with throughput used as the performance indicator for parameter analysis and performance optimization. The goal is to improve the overall performance of distributed storage systems represented by Ceph. The main research is as follows:

1. The tiered storage solutions of current mainstream distributed storage vendors and other mainstream tiered storage solutions are analyzed. Based on the Ceph distributed storage system, the concept and scheme of tiered storage based on the storage engine are proposed in order to make the fullest use of storage space, avoid wasted space, and reduce the impact of load changes.

2. Tiered storage scheduling in the distributed storage system is abstracted as the 0/1 knapsack problem, and a tiered storage scheduling method based on the ant colony algorithm is proposed. The feasibility analysis, mathematical model, and algorithm flow of the method are given, the optimization of the algorithm parameters is analyzed, and the ant colony-based tiered scheduling algorithm is applied to the design and implementation of the tiered storage system.

3. For different tiered storage schemes and tier replacement algorithms, detailed performance test plans are developed, tests on different storage types are completed, and the schemes are compared and analyzed to provide a reference for choosing among current Ceph tiering schemes.

Finally, a tiered scheduling scheme for storage engines based on the ant colony algorithm is implemented on the Ceph distributed storage system, and an experimental environment for testing tiered scheduling performance is built on top of the tiered storage. Comparison with other storage tiering schemes and with the native Ceph tiering based on storage pools shows that the ant colony algorithm-based tiered storage scheme proposed in this thesis offers a clear performance improvement and cost advantages.
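To illustrate the idea of treating tier placement as a 0/1 knapsack problem solved with an ant colony heuristic, here is a minimal Python sketch. It is not the thesis's implementation, since the abstract does not give the model or the parameters. The function name `aco_knapsack`, the use of object size as weight and access "heat" as value, and all parameter defaults (`ants`, `iterations`, `alpha`, `beta`, `rho`) are assumptions made for illustration only.

```python
import random

def aco_knapsack(sizes, values, capacity, ants=20, iterations=50,
                 alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Choose objects for the fast tier with a basic ant colony heuristic.

    Hypothetical model: sizes[i] is the space object i needs on the fast
    tier, values[i] is its benefit (e.g. access heat). Both are assumed to
    be strictly positive. Returns (sorted chosen indices, total value).
    """
    rng = random.Random(seed)          # fixed seed keeps runs repeatable
    n = len(sizes)
    pheromone = [1.0] * n              # one pheromone trail per object
    heuristic = [values[i] / sizes[i] for i in range(n)]  # value density
    best_set, best_value = [], 0.0

    for _ in range(iterations):
        for _ in range(ants):
            chosen, free = [], capacity
            candidates = [i for i in range(n) if sizes[i] <= free]
            # Each ant adds objects one at a time until nothing else fits.
            while candidates:
                weights = [pheromone[i] ** alpha * heuristic[i] ** beta
                           for i in candidates]
                pick = rng.choices(candidates, weights=weights, k=1)[0]
                chosen.append(pick)
                free -= sizes[pick]
                candidates = [i for i in candidates
                              if i != pick and sizes[i] <= free]
            value = sum(values[i] for i in chosen)
            if value > best_value:
                best_set, best_value = sorted(chosen), value
        # Evaporate all trails, then reinforce the best selection so far.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for i in best_set:
            pheromone[i] += rho

    return best_set, best_value

# Made-up example data: five objects competing for 8 units of fast-tier space.
sizes = [4, 3, 2, 5, 1]
values = [40, 50, 10, 60, 5]
capacity = 8
chosen, total = aco_knapsack(sizes, values, capacity)
print(chosen, total)
```

The result is a heuristic selection: it always fits within the capacity, but it is not guaranteed to be optimal. An exact dynamic-programming knapsack solver would be a natural baseline for comparison on small inputs.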
Keywords/Search Tags:distributed systems, storage systems, tiered storage, data migration, ACA