I/O Performance Optimization For Flash Based Fault-tolerant RAID Systems

Posted on: 2019-08-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Liang
GTID: 1368330551956897
Subject: Computer software and theory
Abstract/Summary:
Because of their large capacity, high concurrency, and good reliability, flash-based SSDs are widely used in storage systems such as cloud storage platforms and data centers to meet the growing demand for data in the era of large-scale storage. However, on the one hand, I/O contention, garbage collection, and similar factors keep the utilization of SSD chips low, which seriously degrades the I/O performance of the entire flash array. On the other hand, storage technologies such as MLC/TLC have increased the capacity of flash memory chips while sacrificing their reliability, so maintaining the fault tolerance of flash arrays is also urgent. A new trend is for SSDs to use chip-level RAID-5 to ensure fault tolerance, balancing high performance and reliability. In this scenario, I/O contention is more serious because parity block updates result in more writes. How to alleviate I/O contention and thereby improve flash I/O performance is a key issue for flash array storage systems.

This thesis studies I/O performance optimization in RAID-5-based fault-tolerant flash array storage systems. It covers mitigating read/write I/O contention at the device layer of the flash array, alleviating GC/write I/O contention at the control layer, and designing a scaling method that aggregates scaling and application I/O at the application layer. The main research contents and contributions are as follows:

1) Optimize read I/O performance by alleviating read/write I/O contention. I/O contention in the flash array seriously degrades read performance and leads to poor parallelism. To alleviate read/write I/O contention in the flash chip array and keep the read performance of the SSDs stable, this thesis proposes a Balanced Redirected Read (BRR) strategy. Unlike the traditional read request process, BRR redirects part of the read requests from busy chips to relatively idle chips: instead of reading the data directly from a busy chip, it reconstructs the target data by decoding the other data blocks and the parity block in the same parity group. However, adjacent chips tend to have similar workloads. The traditional data layout therefore places chips with similar load in the same parity group, which does not favour BRR redirection. To this end, this thesis designs a new RAID-5 data layout for the flash array, LCI-RAID-5. LCI-RAID-5 assigns adjacent chips to different parity groups, so that the load on the chips within one parity group differs as much as possible. Under this layout, BRR has more opportunities to redirect reads. Simulation results show that BRR significantly reduces the waiting time of read requests. Compared with the recent PIQ algorithm, BRR reduces read I/O latency by up to 38.4%, with an average reduction of 14.2% across different traces.

2) Optimize write I/O performance by mitigating GC/write I/O contention. During garbage collection (GC), the migration of valid pages forces the flash chip array to perform a large number of additional read and write I/O operations, and the subsequent erase operations also take a long time. As a result, requests that access the same chips wait a long time. To solve this problem, we propose a Deferring Garbage Collection (DGC) solution to improve I/O performance. DGC first predicts whether GC will be triggered on a chip by monitoring the number of pending write requests and the number of available free pages, and then redirects the write requests in the waiting queue of a "busy" chip to other "idle" chips. DGC thereby delays GC on the busy chip and reduces the contention between GC and write requests. The deferred GC is performed in the background while the flash array is idle, which reduces its impact on subsequent requests to the array. We implement DGC on a trace-driven simulator. Compared with the traditional solution using chip-level RAID-5, DGC reduces the average response time by 5.8% to 46.7% and the 99th-percentile response time by 25.3% to 77.6% under different workloads.

3) Design a flash array scaling method based on I/O aggregation. Scaling operations are often performed in flash array systems to meet growing storage capacity and I/O performance requirements. Existing scaling methods may need to migrate almost all the data blocks in the system, or to recalculate all or part of the parity blocks during scaling, which greatly increases scaling time, I/O cost, and computational cost. To solve this problem, this thesis proposes a RAID-5 scaling scheme called Intra-Stripe Migration (ISM). With ISM, data migration occurs only within a stripe, so the encoding relationship between the blocks remains the same. Therefore, the parity block does not need to be recalculated during data migration, which greatly reduces I/O and computational costs. The characteristics of ISM can be summarized as follows. (1) It migrates the minimum number of data blocks. (2) It does not recalculate the parity block during migration, and it uses aggregated I/O for data migration. (3) It supports multiple consecutive scaling operations while maintaining the above properties. The simulation results show the following. (1) In the offline scaling experiment, ISM reduces the scaling time by 47.91% to 87.01% compared with GSR, and by 88.94% to 96.58% compared with ALV. (2) In the online scaling experiment with real server traces, ISM reduces the scaling time by 64.15% to 87.30% compared with GSR, and by 92.38% to 95.98% compared with ALV. (3) After scaling, ISM delivers almost the same data access performance as the optimal data layout.
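The redirected read described in contribution 1) relies on the standard RAID-5 property that the parity block is the XOR of the data blocks in a parity group, so any one block can be rebuilt from the others. The following is a minimal sketch of that idea only, not the thesis's BRR implementation: the function names, the `queue_depth` list (pending requests per chip, data chips first and the parity chip last), and the `threshold` value are hypothetical choices made for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (RAID-5 parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, parity, target):
    """Rebuild stripe[target] from the other data blocks plus the parity block."""
    others = [blk for i, blk in enumerate(stripe) if i != target]
    return xor_blocks(others + [parity])


def read_block(stripe, parity, target, queue_depth, threshold=4):
    """Return (data, redirected).

    Hypothetical policy: redirect when the target chip has at least
    `threshold` more pending requests than the busiest other chip
    in the same parity group.
    """
    peers = [d for i, d in enumerate(queue_depth) if i != target]
    if queue_depth[target] - max(peers) >= threshold:
        return reconstruct(stripe, parity, target), True
    return stripe[target], False


# Three data chips and one parity chip; chip 0 is the busy one.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(stripe)
data, redirected = read_block(stripe, parity, 0, queue_depth=[9, 1, 2, 0])
```

In this sketch the read of chip 0 is served from chips 1, 2, and the parity chip, at the cost of extra reads on those chips. That trade-off is why the decision compares the target's load with the load of its peers, and why a layout that spreads similarly loaded chips across different parity groups gives such a policy more chances to redirect.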
Keywords/Search Tags: Flash Array, I/O Contentions, Garbage Collection, Load Balance, Data Migration