
Design And Implementation Of A Distributed File System

Posted on: 2016-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: C Bai
Full Text: PDF
GTID: 2308330473954344
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, large volumes of structured and unstructured data are generated every day. Storage devices, the infrastructure that holds this data, are among the most important components of a computer system, yet they face high cost, low reliability, low utilization, poor scalability, and expensive maintenance. Storage systems have gradually separated from computer systems to become independent modules. From DEC's early File Access Listener to today's Google File System, file systems have passed through roughly three phases: network file systems, SAN-shared file systems, and object-oriented file systems. Network file systems focused on sharing files in a network environment and solved the problem of interaction between clients and file servers. SAN-shared file systems focused on the scalability of the storage system and on SAN sharing. Object-oriented file systems focus on object storage, concurrent access, and metadata management. From an architectural point of view, there are C/S-based architectures, shared-storage SAN architectures, cluster-based distributed architectures, and symmetric P2P architectures. The cluster-based distributed architecture is widely used; it consists of three components: clients, metadata servers, and data servers. The client sends read and write requests and caches metadata and file data. The metadata server manages metadata and processes client requests; it is the core component of the file system. The data server stores and manages file contents and ensures the availability and integrity of the data. The benefit of this architecture is that performance and capacity can be expanded together, which gives the system better scalability. In this paper, we design and implement a cluster-based distributed file system with a centralized metadata service model.
The metadata service consists of three main components: a resource management node, a master metadata service node, and a standby metadata service cluster. The resource management node manages all metadata nodes and checks their status through heartbeat packets. When the master metadata node crashes, the resource management node can recover the metadata service immediately, which eliminates the single point of failure and improves the availability of the metadata service. The data server stores file data and manages data files, each of which consists of data blocks. Each data server process is responsible for one disk and all the blocks on it. In this paper we design a new file storage format that merges small files into a block, which both reduces the overhead of write operations and saves disk space. Even though data servers run on inexpensive PCs, disk resources are valuable, and an unbalanced disk load wastes disk space. This paper therefore first analyzes disk load and then uses a genetic algorithm to balance the load across all disks and even out disk utilization. Finally, we test the high availability of the metadata service, small-file write operations, and disk load balancing; compared with HDFS, the test results achieve the desired objectives.
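
The abstract does not describe the on-disk layout of the small-file storage format, so the following is only a minimal sketch of the general idea of merging small files into one block. The class name `PackedBlock`, its methods, and the in-memory index of (offset, length) pairs are all hypothetical and are not taken from the thesis.

```python
class PackedBlock:
    """Hypothetical block that stores many small files back to back.

    Assumption: one block is a single contiguous byte region plus an
    index mapping each file name to its (offset, length) in the block.
    """

    def __init__(self, capacity=64 * 1024 * 1024):
        self.capacity = capacity   # maximum number of bytes in the block
        self.buf = bytearray()     # packed contents of all small files
        self.index = {}            # file name -> (offset, length)

    def append(self, name, payload):
        """Append one small file; return False if the block is full."""
        if name in self.index:
            raise KeyError(f"{name} is already stored in this block")
        if len(self.buf) + len(payload) > self.capacity:
            return False
        offset = len(self.buf)
        self.buf += payload
        self.index[name] = (offset, len(payload))
        return True

    def read(self, name):
        """Return the bytes of one small file using the index."""
        offset, length = self.index[name]
        return bytes(self.buf[offset:offset + length])


block = PackedBlock(capacity=16)
block.append("a.txt", b"hello")
block.append("b.txt", b"world!")
print(block.read("b.txt"))
```

Under these assumptions, many small writes become sequential appends to one block, and each file costs only an index entry instead of a separate block, which is the kind of saving the abstract attributes to the new format.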
Keywords/Search Tags: distributed, metadata, small files, load balancing