
Research On Performance Optimization Method On Highly Reliable Storage Arrays

Posted on: 2020-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: X S Huang
Full Text: PDF
GTID: 2428330620459975
Subject: Computer Science and Technology
Abstract/Summary:
In cloud storage and big data processing systems, highly reliable storage arrays are a popular choice for providing high reliability at low monetary cost. Such arrays typically guarantee reliability in two ways: erasure codes, which ensure data recovery when several disks fail at the same time, and backup. In disk arrays that tolerate triple disk failures (3DFTs), the coding is complicated, so partial stripe writes, which write several contiguous data blocks, cause a large number of parity modifications and become a write performance bottleneck. In backup systems, deduplication is a typical application. Because the index table must be transferred frequently from external storage to memory, the amount of data transferred is huge and overall performance is limited.

To address the low performance of partial stripe writes, this thesis proposes an optimized partial stripe write (OPS) method, which reorganizes the distribution of write data blocks so that they share several partial parities. This effectively reduces the number of modified parities and thereby improves overall I/O performance. To illustrate the effectiveness of OPS, we use Disksim to evaluate several partial stripe write methods through simulation. The results show that, compared with traditional partial stripe write methods, OPS reduces the average response time by up to 37.21% and the number of write operations by up to 26.22%.

To speed up data deduplication applications in storage arrays, this thesis proposes an optimized method based on a fusion of computing and storage resources, which provides embedded computing cores to process data close to where it is stored. This reduces data movement from the storage devices to memory and shortens the data path from the storage devices to the CPU registers, which improves the performance of fingerprint comparisons in deduplication applications. To illustrate the effectiveness of this method, we use Disksim to evaluate various deduplication workloads through simulation. The results show that, with an in-memory index containing 2 million fingerprints and a B+ tree with 250 fingerprints per leaf, our method reduces the running time by a factor of up to 285.74 and the number of data movements by a factor of up to 125.13 compared with traditional deduplication applications.
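
The link between partial stripe writes and parity modifications can be illustrated with a minimal sketch. It is not the OPS method and does not model a 3DFT code: it assumes a single XOR parity per stripe (RAID-5 style), and the names `xor_blocks`, `full_parity`, and `partial_stripe_write` are hypothetical. It shows only that the deltas of several updated blocks in one stripe can be accumulated and applied to the parity in a single modification, which is the kind of sharing that reduces parity writes.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Recompute the XOR parity from every data block in the stripe."""
    return reduce(xor_blocks, blocks)


def partial_stripe_write(stripe, parity, updates):
    """Apply `updates` (block index -> new block) to one stripe.

    Assumption: one XOR parity per stripe. The old/new deltas of all
    updated blocks are folded into a single delta, so the parity block
    is modified once however many data blocks change.
    """
    delta = bytes(len(parity))  # all-zero block, the XOR identity
    new_stripe = list(stripe)
    for idx, new_block in updates.items():
        delta = xor_blocks(delta, xor_blocks(stripe[idx], new_block))
        new_stripe[idx] = new_block
    return new_stripe, xor_blocks(parity, delta)


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"]
    parity = full_parity(stripe)
    # Two contiguous blocks are rewritten; the parity is touched once.
    new_stripe, new_parity = partial_stripe_write(
        stripe, parity, {1: b"\xff\x00", 2: b"\x00\xff"}
    )
    print(new_parity == full_parity(new_stripe))
```

In a code that tolerates three failures, each data block contributes to several parity blocks, so the same accumulation would have to be done per parity chain; how blocks are arranged so that chains are shared is the subject of the thesis and is not reproduced here.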
Keywords/Search Tags:Erasure Codes, Partial Stripe Write, Active Disks, Deduplication, Performance