Cluster computing, in contrast to monolithic supercomputers, has become the dominant approach to building high performance computing (HPC) systems. An HPC cluster not only provides aggregate computing power but also requires its storage system to scale to very large capacity and performance without complicating data sharing and management. Since traditional storage solutions fail to meet these requirements, researchers are now turning to new storage architectures and file system structures.

Building on the object storage architecture, this thesis presents the design of a parallel file system based on shared object-based storage devices, named SOPFS (Shared Object-based storage device Parallel File System). SOPFS is designed as a cluster storage system with high performance, scalability, and high availability. The thesis describes the overall design of SOPFS and several of its key points. The contributions of this thesis are: (1) a metadata organization and distribution method within the metadata server cluster, named Dynamic Hashing Partition (DHP), and a metadata access policy based on DHP; (2) a per-file metadata replication algorithm that optimizes metadata access efficiency under the bursty access patterns that occur frequently in parallel scientific computing environments; (3) a lazy metadata update policy that exploits the separation of the control path from the data path to deliver high I/O throughput; (4) a transaction logging mechanism, combined with DHP, that provides strong metadata consistency and fast recovery after a system failure.