Efficient, searchable, graph-structured file system metadata services

Posted on:2012-04-10

Degree:Ph.D

Type:Dissertation

University:University of California, Santa Cruz

Candidate:Ames, Alexander K

GTID:1458390011955629

Subject:Computer Science

Abstract/Summary:

File system metadata management has become a bottleneck for many data-intensive applications that rely on high-performance file systems. Part of the bottleneck is due to the limitations of an almost 50-year-old interface standard with metadata abstractions that were designed at a time when high-end file systems managed less than 100 MB. Today's high-performance file systems store 7 to 9 orders of magnitude more data, resulting in numbers of data items for which these metadata abstractions are inadequate; directory hierarchies, for example, cannot handle complex relationships among data. Users of file systems have attempted to work around these inadequacies by moving application-specific metadata management to relational databases to make metadata searchable. Splitting file system metadata management into two separate systems introduces inefficiencies and systems management problems.

To address the problem, we propose QMDS: a file system metadata management service that integrates all file system metadata and uses a graph data model with attributes on nodes and edges. This dissertation explores the effectiveness of this approach. We present the data model, a query language interface, the design of a prototype metadata store, and query processing. Our graph-based logical data model extends the hierarchical model already in use for file systems. Hierarchies are inadequate for the organizational needs of several example domains, but we show that the graph model can support their needs. The query language interface allows for file identification and attribute retrieval, based on graph-oriented search operators instead of relational, table-oriented joins. The prototype design uses in-memory data structures within an architecture that uses memory-mapped files for persistent metadata storage.

We use workloads from three example domains to evaluate the prototype based on ingest and query performance. Notably, within one of our workloads, when compared to the use of a file system and relational database, the QMDS prototype shows superior performance for both ingest and query workloads. Finally, we contrast the static properties and access patterns from these three workloads to explore the effectiveness of our design choices and suggest options for system and hardware configurations.
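
The graph data model described in the abstract, with attributes on both nodes and edges, can be illustrated with a minimal Python sketch. This is not the QMDS implementation or its query language: the class `MetadataGraph`, its methods `match_nodes` and `neighbors`, and the attribute names (`name`, `type`, `author`, `rel`) are all hypothetical, chosen only to show how a directory hierarchy and a non-hierarchical relationship might coexist in one structure and be searched by following edges rather than joining tables.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A file or directory with arbitrary attributes (hypothetical)."""
    node_id: int
    attrs: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed link between two nodes, carrying its own attributes."""
    src: int
    dst: int
    attrs: dict = field(default_factory=dict)


class MetadataGraph:
    """Illustrative in-memory store; an assumption, not the QMDS design."""

    def __init__(self):
        self.nodes = {}      # node_id -> Node
        self.out_edges = {}  # node_id -> list of outgoing Edge objects

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = Node(node_id, attrs)
        self.out_edges.setdefault(node_id, [])

    def add_edge(self, src, dst, **attrs):
        self.out_edges[src].append(Edge(src, dst, attrs))

    def match_nodes(self, **attrs):
        """Return ids of nodes whose attributes include every given pair."""
        return [n.node_id for n in self.nodes.values()
                if all(n.attrs.get(k) == v for k, v in attrs.items())]

    def neighbors(self, node_ids, **edge_attrs):
        """Follow outgoing edges whose attributes match; return target ids."""
        found = []
        for nid in node_ids:
            for e in self.out_edges[nid]:
                matches = all(e.attrs.get(k) == v for k, v in edge_attrs.items())
                if matches and e.dst not in found:
                    found.append(e.dst)
        return found


g = MetadataGraph()
g.add_node(1, name="papers", type="dir")
g.add_node(2, name="qmds.tex", type="file", author="alice")
g.add_node(3, name="refs.bib", type="file", author="bob")
g.add_edge(1, 2, rel="child")   # hierarchy expressed as ordinary edges
g.add_edge(1, 3, rel="child")
g.add_edge(2, 3, rel="cites")   # a relationship a pure hierarchy cannot hold

# Identify a file by attribute, then search along a labeled edge.
start = g.match_nodes(name="qmds.tex")
cited = g.neighbors(start, rel="cites")
print([g.nodes[n].attrs["name"] for n in cited])  # prints ['refs.bib']
```

In this sketch the directory tree is just the subset of edges labeled `child`, so hierarchical lookups and cross-cutting relationships such as `cites` are answered by the same traversal step.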

Keywords/Search Tags:

File system metadata, Workloads

Related items

1	Research On Key Issues In Large-Scale Cluster File Systems
2	Metadata Management For Parallel File Systems
3	Metadata Management Optimization In Distributed File Systems
4	Optimizing energy and performance for server-class file system workloads
5	Parallel Computation For File Metadata Cube In Cloud Computing Environments
6	Study And Implementation On Key Techniques Of Distributed File System
7	Research Of File System Metadata Graph
8	Design And Implementation Of Efficient Metadata Index In Distributed File System
9	Research And Implementation Of A File System Metadata Storage Technology Based On NVM
10	Research On Metadata Management Of Parallel File System Build On Shared Object-based Storage Device