Shortening The Read Latency Of Ceph Through Improving The Client-side Data Cache And Read Request Scheduling Scheme

Posted on: 2018-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: M Tang
GTID: 2348330566951636
Subject: Computer Science and Technology
Abstract/Summary:
In the information era, distributed file systems have become the main choice for mass data storage thanks to their high reliability, large capacity, and scalability. In applications of distributed file systems, read operations account for a high proportion of requests, and they are more sensitive to delay than write operations. Reducing read access latency therefore plays an important role in system performance. A study of the data access process in open source Ceph finds that two delays dominate read request processing: the transmission delay of the requested data in the network layer and the service delay of the read request on the nodes. Two targeted optimizations are designed. First, object prefetching is designed according to Ceph's object file striping and the principle of locality; it improves the hit rate of the client data cache and reduces the long network transmission delay. To improve prefetching accuracy, the thesis designs a dynamic adjustment algorithm that sets the size of the prefetching window so that the cache hit rate tends toward the optimum. A two-level queue manages the objects in the client cache, which are classified as cached data and prefetched data. Second, read scheduling reduces the time consumed on the nodes by selecting the target node from the output of the data location algorithm. The scheduling optimization algorithm considers two factors: the minimum area on the cluster's topology graph and the I/O operations in the shared work queue. This balances Ceph read operations and reduces the extra delay that hot spots cause in node queues. Finally, the two optimization schemes are integrated into Ceph to build a prototype platform. The test results show that the two schemes effectively improve system performance and the load balance across nodes. Compared with the original system, adaptive object prefetching reduces the response time of read requests by 13.73%, and the scheduling algorithm reduces the load variance across nodes by 17.6%.
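The abstract does not give the details of the prefetching algorithm, so the following is only a minimal Python sketch of the general idea, not the thesis's implementation. It assumes that the objects of a striped file can be identified by consecutive integer indices, that both cache queues use LRU eviction, and that the window is doubled when a prefetched object is used and halved on a miss. The class name `PrefetchingCache`, its parameters, and the doubling/halving rule are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class PrefetchingCache:
    """Illustrative client cache with a demand queue and a prefetch queue."""

    def __init__(self, capacity=8, min_window=1, max_window=8):
        self.capacity = capacity         # max objects per queue (assumed value)
        self.min_window = min_window
        self.max_window = max_window
        self.window = min_window         # current prefetch window, in objects
        self.demand = OrderedDict()      # objects the client actually read
        self.prefetched = OrderedDict()  # objects fetched speculatively
        self.hits = 0
        self.misses = 0

    def _put(self, queue, obj_id):
        # Insert at the most-recently-used end and evict LRU entries if full.
        queue[obj_id] = True
        queue.move_to_end(obj_id)
        while len(queue) > self.capacity:
            queue.popitem(last=False)

    def _prefetch_after(self, obj_id):
        # Speculatively load the next `window` objects of the striped file.
        for nxt in range(obj_id + 1, obj_id + 1 + self.window):
            if nxt not in self.demand and nxt not in self.prefetched:
                self._put(self.prefetched, nxt)

    def read(self, obj_id):
        """Return True on a cache hit, False if a cluster fetch was needed."""
        if obj_id in self.demand:
            self.hits += 1
            self.demand.move_to_end(obj_id)
            return True
        if obj_id in self.prefetched:
            # A prefetched object was used: promote it and widen the window.
            self.hits += 1
            del self.prefetched[obj_id]
            self._put(self.demand, obj_id)
            self.window = min(self.window * 2, self.max_window)
            self._prefetch_after(obj_id)
            return True
        # Miss: the prediction failed, so shrink the window before prefetching.
        self.misses += 1
        self._put(self.demand, obj_id)
        self.window = max(self.window // 2, self.min_window)
        self._prefetch_after(obj_id)
        return False
```

Under these assumptions, a sequential scan misses only on its first object and the window grows to its maximum, while a jump to an unrelated object halves the window again. Keeping prefetched objects in their own queue means a burst of wrong predictions can only evict other speculative data, not objects the client has actually read.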
Keywords/Search Tags: Distributed file system, Object prefetching, Adaptive, Read scheduling