
The Design And Implementation Of Metadata Management Subsystem In Large-scale Distributed File System

Posted on: 2014-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: W H Zhan
GTID: 2268330401965460
Subject: Computer system architecture

Abstract/Summary:
Cloud computing is the product of grid computing and distributed computing. It is a business concept whose development has brought immeasurable value to modern society. Storage systems, as the foundation of cloud computing, have become a research focus of the industry at this stage. As the amount of data to be stored grows and people’s demands on data security, data sharing and data management rise, traditional forms of data storage can no longer keep up. In recent years especially, the rise of cloud computing applications such as high-performance computing, open storage services and massive data processing has placed stringent requirements on the performance, reliability and scalability of storage systems.

Metadata management is an important part of a distributed file system and directly affects its efficiency, availability and scalability. How to manage the massive amount of metadata in the system is one of the core issues that a large-scale distributed file system must address.

This thesis describes several mainstream distributed file systems and, building on these existing implementations, introduces a large-scale distributed file system named C-STORE. It describes the overall architecture of the system, with particular attention to the design and implementation of its metadata management subsystem.

The thesis presents an overall framework for constructing a large-scale distributed file system that uses a three-level mapping hash method for resource location. With this method the system avoids a central node, which improves its efficiency and scalability; replication and load balancing can also be achieved to enhance the availability of the system. The system implements deduplication by using the digest (content summary) of a resource to locate it, which improves storage utilization. Comprehensive error handling and a garbage collection process allow the system to run efficiently and stably.

On the basis of the overall system architecture, the thesis focuses on the design and implementation of the metadata management subsystem. This subsystem uses the local file system to manage metadata and implements an efficient metadata operation log to keep the metadata consistent.

The metadata management subsystem is written in C++ on Linux. To use the system’s hardware and software resources efficiently, it is implemented according to an asynchronous, event-driven network programming model.

According to the tests reported in the thesis, the large-scale distributed file system C-STORE meets the needs of users of public cloud storage services, and its metadata management subsystem performs well in comparison with the mature and widely used PVFS.
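The idea of locating a resource by its digest, so that identical content is stored only once and no central index node is needed, can be illustrated with a minimal sketch. Everything named below (digest_of, node_for, DedupStore) is hypothetical and is not taken from C-STORE. FNV-1a stands in for the cryptographic digest a real system would likely use, and a single modulo step stands in for the thesis's three-level mapping, whose details the abstract does not give.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a content digest: 64-bit FNV-1a.
// Assumption: a production system would use a cryptographic hash instead.
inline std::uint64_t digest_of(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV-1a prime
    }
    return h;
}

// Simplified placement: any client can compute the owning node from the
// digest alone, so no central lookup node is required.
// Assumption: one modulo step replaces the three-level mapping.
inline std::size_t node_for(std::uint64_t digest, std::size_t node_count) {
    assert(node_count > 0);
    return static_cast<std::size_t>(digest % node_count);
}

// Hypothetical in-memory store keyed by content digest.
class DedupStore {
public:
    // Stores data and returns its digest. Identical content is kept once
    // and only its reference count grows. Digest collisions are ignored
    // in this sketch.
    std::uint64_t put(const std::string& data) {
        const std::uint64_t key = digest_of(data);
        auto it = blobs_.find(key);
        if (it == blobs_.end()) {
            blobs_.emplace(key, Entry{data, 1});
        } else {
            ++it->second.refs;
        }
        return key;
    }

    // Number of distinct contents actually stored.
    std::size_t unique_count() const { return blobs_.size(); }

    // Number of put() calls that referred to this digest (0 if unknown).
    std::size_t ref_count(std::uint64_t key) const {
        auto it = blobs_.find(key);
        return it == blobs_.end() ? 0 : it->second.refs;
    }

private:
    struct Entry {
        std::string data;
        std::size_t refs;
    };
    std::unordered_map<std::uint64_t, Entry> blobs_;
};
```

Storing the same bytes twice through put() yields the same digest and leaves unique_count() unchanged, which is the storage saving the abstract attributes to digest-based location. The reference count is one plausible hook for a garbage collection pass.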
Keywords/Search Tags:distributed, file system, metadata, hash, large scale