Efficient, searchable, graph-structured file system metadata services

Posted on: 2012-04-10
Degree: Ph.D.
Type: Dissertation
University: University of California, Santa Cruz
Candidate: Ames, Alexander K
Full Text: PDF
GTID: 1458390011955629
Subject: Computer Science
Abstract/Summary:
File system metadata management has become a bottleneck for many data-intensive applications that rely on high-performance file systems. Part of the bottleneck stems from an almost 50-year-old interface standard whose metadata abstractions were designed when high-end file systems managed less than 100 MB. Today's high-performance file systems store 7 to 9 orders of magnitude more data, and the resulting number of data items overwhelms these abstractions: directory hierarchies, for example, cannot express complex relationships among data. Users have tried to work around these inadequacies by moving application-specific metadata management into relational databases to make metadata searchable. Splitting file system metadata management across two separate systems introduces inefficiencies and systems management problems.

To address the problem, we propose QMDS, a file system metadata management service that integrates all file system metadata and uses a graph data model with attributes on nodes and edges. This dissertation explores the effectiveness of this approach. We present the data model, a query language interface, the design of a prototype metadata store, and query processing. Our graph-based logical data model extends the hierarchical model already in use for file systems. Hierarchies are inadequate for the organizational needs of several example domains, but we show that the graph model can support those needs. The query language interface supports file identification and attribute retrieval based on graph-oriented search operators instead of relational table-oriented joins. The prototype design uses in-memory data structures within an architecture that uses memory-mapped files for persistent metadata storage.

We use workloads from three example domains to evaluate the prototype's ingest and query performance. Notably, in one of these workloads the QMDS prototype outperforms the combination of a file system and a relational database on both ingest and query. Finally, we contrast the static properties and access patterns of the three workloads to explore the effectiveness of our design choices and to suggest options for system and hardware configurations.
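
To make the data model concrete, here is a minimal Python sketch of a graph with attributes on both nodes and edges, queried by following labeled edges rather than joining tables. It is an illustration only, not the QMDS implementation or its query language: the class `MetadataGraph`, its methods, the file paths, and the attribute names (`kind`, `owner`, `label`, `tool`) are all hypothetical.

```python
from collections import defaultdict


class MetadataGraph:
    """Toy attributed graph: nodes stand for files, edges for relationships.

    Hypothetical illustration; not the QMDS prototype or its API.
    """

    def __init__(self):
        self.nodes = {}                       # node id -> attribute dict
        self.out_edges = defaultdict(list)    # node id -> [(target id, attribute dict)]

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = dict(attrs)

    def add_edge(self, source, target, **attrs):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.out_edges[source].append((target, dict(attrs)))

    def match(self, **attrs):
        """Ids of nodes whose attributes include every given key/value pair."""
        return [n for n, a in self.nodes.items()
                if all(a.get(k) == v for k, v in attrs.items())]

    def neighbors(self, node_id, **edge_attrs):
        """Targets of outgoing edges whose attributes include every given pair."""
        return [t for t, a in self.out_edges.get(node_id, [])
                if all(a.get(k) == v for k, v in edge_attrs.items())]


# Example data (made up): a raw file, a file derived from it, and a note.
g = MetadataGraph()
g.add_node("/data/run1.raw", kind="raw", owner="alice")
g.add_node("/data/run1.h5", kind="derived", owner="alice")
g.add_node("/docs/notes.txt", kind="text", owner="bob")
g.add_edge("/data/run1.h5", "/data/run1.raw", label="derived_from", tool="convert")
g.add_edge("/docs/notes.txt", "/data/run1.h5", label="describes")

# Graph-oriented search: from alice's derived files, follow "derived_from" edges.
sources = [src
           for f in g.match(kind="derived", owner="alice")
           for src in g.neighbors(f, label="derived_from")]
```

The "derived_from" relationship here cuts across the directory tree, which is the kind of link a pure hierarchy cannot express and which a relational design would typically recover with a join.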
Keywords/Search Tags: File system metadata, Workloads