
Research On Cache Technique Of Parallel File System

Posted on: 2005-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Lin
Full Text: PDF
GTID: 2168360152969181
Subject: Computer software and theory

Abstract/Summary:
As the I/O subsystem of a cluster server, a parallel file system provides global access to all files, devices, and network resources distributed across the cluster nodes, so a high-performance parallel file system is essential to the cluster server. Exploiting the gaps among memory speed, network bandwidth, and disk bandwidth by building caches into the system is one of the main ways to improve file system performance. PVFS (Parallel Virtual File System) is an open-source parallel file system, but its file access performance and scalability are poor. To address these problems, HPVFS (High Performance Parallel Virtual File System) implements several strategies for high scalability, high availability, and high I/O throughput on the basis of PVFS. The cache system is one of these strategies; it enables fast access to both metadata and data. The cache system of HPVFS consists of a metadata cache, compute node caches, and storage node caches.

By simplifying the metadata access protocol, the metadata cache saves one communication with the metadata server on every cache hit. Because metadata is written back to the metadata server promptly, and metadata exceptions can be quickly detected and corrected, the metadata cache removes the need to handle metadata consistency separately.

The compute node cache is implemented with shared memory. A daemon that allocates and manages the shared-memory cache space starts running when the cache system starts up. Application processes attach the shared memory to their own address spaces and can then access the cache directly. The shared-memory cache reduces the number of data copies between two processes and achieves true data sharing among the processes on a node.

The design of the storage node cache addresses a serious problem: a single miss in one storage node cache may break the parallel data access of the system and degrade the efficiency of the other caches. The Coordinated Multi-Queue (CMQ) algorithm relieves this problem effectively. CMQ classifies blocks that are often accessed together into an access group and replaces the blocks of an access group in bulk. In this way CMQ equalizes the cache hit ratios of all the storage node caches and improves the global cache hit ratio of the cluster file system.

The test results indicate that the metadata cache raises the speed of metadata operations by more than 100%, and the compute node cache improves parallel read and parallel write performance by more than 30%. The simulation results show that the hit-ratio improvement of the CMQ algorithm reaches up to 125% compared with traditional algorithms such as LFU and LRU.
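To make the compute-node cache concrete, the following is a minimal sketch of how a daemon-created shared-memory segment can be attached by an application process, in the spirit of the design described above. The abstract does not specify an API; the System V shmget/shmat calls, the key 0x48505646, and the 4 MiB segment size are illustrative assumptions, not the thesis implementation.

/* Sketch: daemon-managed shared-memory cache segment attached by an
 * application process (assumed System V shared memory interface). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define CACHE_KEY  0x48505646          /* hypothetical IPC key */
#define CACHE_SIZE (4 * 1024 * 1024)   /* hypothetical 4 MiB cache area */

int main(void)
{
    /* The daemon would create the segment once with IPC_CREAT; an
     * application process attaches the same key afterwards. */
    int shmid = shmget(CACHE_KEY, CACHE_SIZE, IPC_CREAT | 0600);
    if (shmid < 0) {
        perror("shmget");
        return EXIT_FAILURE;
    }

    /* Attaching maps the cache into this process's address space, so
     * cached file data can be read directly with no extra copy between
     * the daemon and the application. */
    void *cache = shmat(shmid, NULL, 0);
    if (cache == (void *)-1) {
        perror("shmat");
        return EXIT_FAILURE;
    }

    memset(cache, 0, CACHE_SIZE);      /* daemon-side initialisation */
    printf("cache segment attached at %p\n", cache);

    shmdt(cache);                      /* detach when done */
    return EXIT_SUCCESS;
}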
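The group-wise replacement idea behind CMQ can also be illustrated with a toy sketch: blocks that tend to be accessed together carry the same group id, and a replacement evicts an entire group at once rather than single blocks. The slot count, structures, and victim rule (evict the group whose most recent access is oldest) are assumptions made for illustration; the thesis's actual CMQ bookkeeping, which also coordinates hit ratios across storage nodes, is richer than this.

/* Sketch: bulk eviction of an access group, in the spirit of CMQ. */
#include <stdio.h>
#include <limits.h>

#define NSLOTS   8
#define NGROUPS  4

struct slot {
    int valid;      /* 1 if the slot holds a cached block */
    int block;      /* block number */
    int group;      /* access-group id shared by blocks used together */
    long last_use;  /* logical time of the most recent access */
};

static struct slot cache[NSLOTS];

/* Evict every block of the group whose most recent access is oldest,
 * freeing space for an incoming access group in one bulk step. */
static void evict_coldest_group(void)
{
    long group_recency[NGROUPS];
    for (int g = 0; g < NGROUPS; g++)
        group_recency[g] = -1;

    /* A group's recency is the newest last_use among its blocks. */
    for (int i = 0; i < NSLOTS; i++)
        if (cache[i].valid && cache[i].last_use > group_recency[cache[i].group])
            group_recency[cache[i].group] = cache[i].last_use;

    int victim = -1;
    long oldest = LONG_MAX;
    for (int g = 0; g < NGROUPS; g++)
        if (group_recency[g] >= 0 && group_recency[g] < oldest) {
            oldest = group_recency[g];
            victim = g;
        }

    for (int i = 0; i < NSLOTS; i++)
        if (cache[i].valid && cache[i].group == victim)
            cache[i].valid = 0;   /* bulk eviction of the whole group */
}

int main(void)
{
    /* Populate two groups with different recency, then evict the colder one. */
    for (int i = 0; i < 6; i++) {
        cache[i].valid = 1;
        cache[i].block = i;
        cache[i].group = i < 3 ? 0 : 1;
        cache[i].last_use = i;    /* group 0 is older than group 1 */
    }
    evict_coldest_group();
    for (int i = 0; i < 6; i++)
        printf("block %d group %d valid=%d\n",
               cache[i].block, cache[i].group, cache[i].valid);
    return 0;
}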
Keywords/Search Tags:Parallel file system, Cache, Metadata cache, Compute nodes cache, Storage nodes cache