| Storage systems in High Performance Computing (HPC) environments face new challenges, such as random access patterns. Generic system-level file systems, which are usually optimized for sequential access to large files, cannot adapt to these access patterns and suffer from low scalability and low concurrency. Existing Burst Buffer file systems either require data to be swapped in and out manually, or swap it automatically but with performance limited by the storage back-end. To address these problems, this thesis designs and implements a Burst Buffer-based temporary file system (called an ad hoc file system) that is deployed on demand for each computational task. It acts as a cache for the back-end storage system and transparently provides applications with higher peak bandwidth than the storage back-end. The work in this thesis consists of two main parts. The first part is the design of the metadata management and data management mechanisms of the temporary file system. The file system is implemented entirely on top of a scalable distributed key-value store. For metadata management, full paths are used as the keys of files and directories, and dedicated metadata servers are eliminated through directory partitioning. Because POSIX contains many non-essential features and strong consistency requirements, this thesis proposes a relaxed POSIX implementation suited to key-value stores and HPC systems. For data management, the local storage available on all compute nodes is aggregated into a Burst Buffer, and centralized file data management is avoided by storing file data in chunks. Data block indexing is proposed to co-locate data with the processes that use it, and index caching is proposed to speed up index lookups. Experiments show that the file system provides scalable concurrent I/O performance, and scalable metadata access performance when processes work in independent directories. The second part is the design of the caching mechanism of the temporary file system. To avoid the consistency maintenance
overhead between the cache and the back-end file system, this thesis first proposes two consistency constraints: (1) cached data is read-only at the storage back-end; (2) cached data follows eventual consistency. Then, building on the temporary file system and these consistency constraints, a transparent swap-in and swap-out mechanism for file metadata and data blocks is designed: data is cached on demand, and both manual and automatic swap-out are provided. Finally, the file system operations are redesigned and implemented for caching scenarios. Experiments show no significant degradation in metadata or data performance when transparent caching is enabled. The concurrent file data access performance of the temporary file system proposed in this thesis is significantly better than that of the system-level file system and is highly scalable. Unlike other Burst Buffer file systems, this file system makes full use of node-local storage performance and achieves transparent caching without sacrificing performance. |
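
The metadata and data layout summarized above can be illustrated with a minimal sketch. It is not the implementation from this thesis: the class `AdHocFS`, the key prefixes `meta:` and `idx:`, the 4-byte chunk size, and the in-memory dictionaries standing in for the distributed key-value store and for node-local storage are all assumptions made for illustration. It shows full paths used as metadata keys, file data split into chunks kept on the writer's node, a chunk index recording where each chunk lives, and an index cache in front of that index.

```python
# Illustrative sketch only: all names, the chunk size, and the storage layout
# are assumptions, not the design described in the thesis.
CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to trace


class AdHocFS:
    def __init__(self):
        self.kv = {}           # stand-in for the distributed key-value store
        self.node_store = {}   # node id -> {(path, chunk_no): bytes}, node-local storage
        self.index_cache = {}  # (path, chunk_no) -> node id, avoids repeated index lookups

    def create(self, path):
        # Metadata is keyed by the full path, so no directory tree walk is needed.
        self.kv["meta:" + path] = {"size": 0}

    def write(self, node, path, data):
        # Simplification: always writes the whole file starting at offset 0.
        for off in range(0, len(data), CHUNK_SIZE):
            chunk_no = off // CHUNK_SIZE
            # Keep each chunk on the writer's own node (co-location).
            self.node_store.setdefault(node, {})[(path, chunk_no)] = data[off:off + CHUNK_SIZE]
            # The chunk index records which node holds the chunk.
            self.kv[f"idx:{path}:{chunk_no}"] = node
        self.kv["meta:" + path]["size"] = len(data)

    def locate(self, path, chunk_no):
        # Consult the index cache first; fall back to the key-value store on a miss.
        key = (path, chunk_no)
        if key not in self.index_cache:
            self.index_cache[key] = self.kv[f"idx:{path}:{chunk_no}"]
        return self.index_cache[key]

    def read(self, path):
        size = self.kv["meta:" + path]["size"]
        n_chunks = (size + CHUNK_SIZE - 1) // CHUNK_SIZE
        return b"".join(
            self.node_store[self.locate(path, i)][(path, i)] for i in range(n_chunks)
        )
```

For example, after `fs = AdHocFS()`, `fs.create("/job/out.dat")` and `fs.write("node3", "/job/out.dat", b"abcdefghij")`, the file is stored as three chunks on `node3`, and `fs.read("/job/out.dat")` reassembles them through the index. A real deployment would replace the dictionaries with remote key-value operations and handle partial writes, which this sketch omits.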