
The Design And Implementation Of A Distributed Storage System

Posted on: 2017-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Pei
Full Text: PDF
GTID: 2428330566989580
Subject: Engineering
Abstract/Summary:
The infrastructure beneath a worldwide social network must continually serve billions of variable-sized media objects such as photos, videos, and audio clips. These objects must be stored and served with low latency and high throughput by a system that is geo-distributed, highly scalable, and load-balanced. Existing file systems and object stores face several challenges when serving such large objects: they store redundant metadata, provide redundant functionality, and leave load-balancing issues unresolved. The objective of this thesis is to store massive numbers of variable-sized media objects and to provide low-latency, high-throughput access to them. Building on existing storage systems and on an analysis of the overall, functional, and performance requirements of distributed storage systems, the thesis designs a distributed storage system with high performance, scalability, and load balancing.

The overall architecture consists of a cluster management module, a front-end routing module, and a data node module. The cluster management module maintains the status of the cluster: it stores the hardware distribution information of the entire distributed system and the logical distribution information of the data. The front-end routing module is the gateway for external requests to the storage system: it handles user requests, performs security checks, and captures the operations performed in the storage system. The data node module is responsible for the actual management and storage of data. The storage system is designed in a decentralized way and leverages techniques such as logical blob grouping, rebalancing mechanisms, and OS caching. As a result, it performs well when storing large immutable data, and it also supports storing files of various sizes.

Finally, the thesis tests the throughput and latency of the system with different data sizes, in write-only, read-only, and read-write modes, and with different numbers of clients. From the data collected while varying the number of clients, the data size, and the operation mode, the thesis plots the throughput and latency of the distributed storage system. The results show the high throughput and low latency, durability, availability, and scalability of the system along these three dimensions, and that it is load balanced...
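The division of work among the three modules can be illustrated with a minimal Python sketch. It is not the thesis implementation: every class and method name here (`DataNode`, `ClusterManager`, `FrontEnd`, `replicas_of`) is hypothetical, the round-robin replica layout and the hash-based choice of partition are assumptions made for brevity, and the size limit merely stands in for the unspecified security checks.

```python
import hashlib
import random
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Data node module: stores blob bytes grouped by logical partition."""
    name: str
    partitions: dict = field(default_factory=dict)  # partition id -> {blob id: bytes}

    def put(self, partition_id, blob_id, data):
        self.partitions.setdefault(partition_id, {})[blob_id] = data

    def get(self, partition_id, blob_id):
        return self.partitions[partition_id][blob_id]


class ClusterManager:
    """Cluster management module: records which nodes hold each partition."""

    def __init__(self, nodes, num_partitions, replicas):
        if replicas > len(nodes):
            raise ValueError("need at least as many nodes as replicas")
        # Assumed layout: replicas of partition p go to consecutive nodes.
        self.layout = {
            p: [nodes[(p + i) % len(nodes)] for i in range(replicas)]
            for p in range(num_partitions)
        }

    def replicas_of(self, partition_id):
        return self.layout[partition_id]


class FrontEnd:
    """Front-end routing module: checks requests and routes them to data nodes."""

    def __init__(self, cluster, max_blob_bytes=1 << 20):
        self.cluster = cluster
        self.max_blob_bytes = max_blob_bytes  # stand-in for real security checks

    def put(self, data):
        if not isinstance(data, bytes) or len(data) > self.max_blob_bytes:
            raise ValueError("rejected by front-end check")
        digest = hashlib.sha256(data).hexdigest()
        # Assumed placement rule: the content hash selects the partition.
        partition_id = int(digest, 16) % len(self.cluster.layout)
        blob_id = f"{partition_id}:{digest}"
        # Write the immutable blob to every replica of the chosen partition.
        for node in self.cluster.replicas_of(partition_id):
            node.put(partition_id, blob_id, data)
        return blob_id

    def get(self, blob_id):
        partition_id = int(blob_id.split(":")[0])
        # Any replica can serve the read; pick one at random to spread load.
        node = random.choice(self.cluster.replicas_of(partition_id))
        return node.get(partition_id, blob_id)
```

In this sketch the blob identifier carries its partition number, so a read needs only the cluster layout to find a replica and no per-blob metadata lookup; whether the thesis uses the same identifier scheme is not stated in the abstract.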
Keywords/Search Tags:Distribution, Storage, Partition, Replica, Load Balance