
Research On Cache Technique Of Parallel File System

Posted on: 2005-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Lin
Full Text: PDF
GTID: 2168360152969181
Subject: Computer software and theory

Abstract/Summary:
As the I/O subsystem of a cluster server, a parallel file system provides global access to all files, devices, and network resources distributed across the cluster nodes, so a high-performance parallel file system is essential to the cluster server. Exploiting the gaps among memory speed, network bandwidth, and disk bandwidth by building caches into the system is one of the main ways to improve file system performance. PVFS (Parallel Virtual File System) is an open-source parallel file system, but its file access performance and scalability are poor. To address these problems, HPVFS (High Performance Parallel Virtual File System) implements several strategies for high scalability, high availability, and high I/O throughput on the basis of PVFS. The cache system is one of these strategies; it enables fast access to both metadata and data. The cache system of HPVFS consists of a metadata cache, compute node caches, and storage node caches.

By simplifying the metadata access protocol, the metadata cache saves one communication with the metadata server on every cache hit. Because metadata is written back to the metadata server promptly, and metadata exceptions can be quickly detected and corrected, the metadata cache removes the need to handle metadata consistency separately.

The compute node cache is implemented with shared memory. A daemon that allocates and manages the shared-memory cache space starts running when the cache system starts up. Application processes attach the shared memory to their own address spaces and can then access the cache directly. The shared-memory cache reduces the number of data copies between two processes and achieves true data sharing among the processes on a node.

The design of the storage node cache addresses a serious problem: a single miss in one storage node cache may break the parallel data access of the system and degrade the efficiency of the other caches. The Coordinated Multi-Queue (CMQ) algorithm relieves this problem effectively. CMQ classifies blocks that are often accessed together into an access group and replaces the blocks of an access group in bulk. In this way CMQ equalizes the cache hit ratios of all the storage node caches and improves the global cache hit ratio of the cluster file system.

The test results indicate that the metadata cache raises the speed of metadata operations by more than 100%, and the compute node cache improves parallel read and parallel write performance by more than 30%. The simulation results show that the hit-ratio improvement of the CMQ algorithm reaches up to 125% compared with traditional algorithms such as LFU and LRU.
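To make the compute-node cache concrete, the following is a minimal sketch of how a daemon-created shared-memory segment can be attached by an application process, in the spirit of the design described above. The abstract does not specify an API; the System V shmget/shmat calls, the key 0x48505646, and the 4 MiB segment size are illustrative assumptions, not the thesis implementation.

/* Sketch: daemon-managed shared-memory cache segment attached by an
 * application process (assumed System V shared memory interface). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define CACHE_KEY  0x48505646          /* hypothetical IPC key */
#define CACHE_SIZE (4 * 1024 * 1024)   /* hypothetical 4 MiB cache area */

int main(void)
{
    /* The daemon would create the segment once with IPC_CREAT; an
     * application process attaches the same key afterwards. */
    int shmid = shmget(CACHE_KEY, CACHE_SIZE, IPC_CREAT | 0600);
    if (shmid < 0) {
        perror("shmget");
        return EXIT_FAILURE;
    }

    /* Attaching maps the cache into this process's address space, so
     * cached file data can be read directly with no extra copy between
     * the daemon and the application. */
    void *cache = shmat(shmid, NULL, 0);
    if (cache == (void *)-1) {
        perror("shmat");
        return EXIT_FAILURE;
    }

    memset(cache, 0, CACHE_SIZE);      /* daemon-side initialisation */
    printf("cache segment attached at %p\n", cache);

    shmdt(cache);                      /* detach when done */
    return EXIT_SUCCESS;
}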
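The group-wise replacement idea behind CMQ can also be illustrated with a toy sketch: blocks that tend to be accessed together carry the same group id, and a replacement evicts an entire group at once rather than single blocks. The slot count, structures, and victim rule (evict the group whose most recent access is oldest) are assumptions made for illustration; the thesis's actual CMQ bookkeeping, which also coordinates hit ratios across storage nodes, is richer than this.

/* Sketch: bulk eviction of an access group, in the spirit of CMQ. */
#include <stdio.h>
#include <limits.h>

#define NSLOTS   8
#define NGROUPS  4

struct slot {
    int valid;      /* 1 if the slot holds a cached block */
    int block;      /* block number */
    int group;      /* access-group id shared by blocks used together */
    long last_use;  /* logical time of the most recent access */
};

static struct slot cache[NSLOTS];

/* Evict every block of the group whose most recent access is oldest,
 * freeing space for an incoming access group in one bulk step. */
static void evict_coldest_group(void)
{
    long group_recency[NGROUPS];
    for (int g = 0; g < NGROUPS; g++)
        group_recency[g] = -1;

    /* A group's recency is the newest last_use among its blocks. */
    for (int i = 0; i < NSLOTS; i++)
        if (cache[i].valid && cache[i].last_use > group_recency[cache[i].group])
            group_recency[cache[i].group] = cache[i].last_use;

    int victim = -1;
    long oldest = LONG_MAX;
    for (int g = 0; g < NGROUPS; g++)
        if (group_recency[g] >= 0 && group_recency[g] < oldest) {
            oldest = group_recency[g];
            victim = g;
        }

    for (int i = 0; i < NSLOTS; i++)
        if (cache[i].valid && cache[i].group == victim)
            cache[i].valid = 0;   /* bulk eviction of the whole group */
}

int main(void)
{
    /* Populate two groups with different recency, then evict the colder one. */
    for (int i = 0; i < 6; i++) {
        cache[i].valid = 1;
        cache[i].block = i;
        cache[i].group = i < 3 ? 0 : 1;
        cache[i].last_use = i;    /* group 0 is older than group 1 */
    }
    evict_coldest_group();
    for (int i = 0; i < 6; i++)
        printf("block %d group %d valid=%d\n",
               cache[i].block, cache[i].group, cache[i].valid);
    return 0;
}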
Keywords/Search Tags:Parallel file system, Cache, Metadata cache, Compute nodes cache, Storage nodes cache