
Research On Distributed File Systems Based On Non-Volatile Memory

Posted on: 2020-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: C L Liu
Full Text: PDF
GTID: 2428330599453343
Subject: Engineering
Abstract/Summary:
In recent years, emerging non-volatile memory (NVM) has offered byte addressability, data persistence across power loss, and read/write performance close to that of DRAM. One important application of NVM is building efficient distributed memory file systems, where expanding NVM storage capacity and reducing NVM read/write overhead are the key problems. Remote direct memory access (RDMA) technology makes it possible to solve them. Although existing distributed memory file systems combine RDMA networks with NVM to achieve high read/write performance, they still incur redundant data copies and message exchanges and cannot fully exploit the advantages of the emerging hardware. This paper therefore proposes Nebula, a distributed memory file system based on NVM and RDMA. Compared with existing distributed memory file systems, Nebula improves both data access performance and space utilization. The main contributions of this paper are as follows (an illustrative sketch of each mechanism appears after the list):

(1) Client-autonomous data I/O mechanism. For a client's access to data on a remote storage node, the client uses the physical address recorded by the metadata storage node to access the file data block in the data storage node's physical memory directly through one-sided RDMA reads and writes, avoiding the overhead of querying the block index in the data storage node's local file system. To optimize access further, Nebula caches metadata at the client with index prefetching and space pre-allocation, which reduces how often the client must contact the metadata storage node to look up the index.

(2) Polymorphic memory space management mechanism. Existing distributed memory file systems fix the size of file data blocks at configuration time, making it hard to adjust block size while the system is running or to adapt to variable file growth patterns. Configuring large blocks lowers NVM space utilization, while configuring small blocks enlarges the metadata index, raises the software overhead of querying it, and, because of software-stack costs, makes small-block reads and writes slower than large-block ones. Nebula therefore keeps free file data blocks of multiple sizes, from 4 KB to 1 GB, and dynamically allocates a block whose size matches the data being written.

(3) I/O data load-balancing mechanism. An unbalanced data distribution across storage nodes places excessive access load on some of them, so that multiple clients contend for limited network card resources and client read/write performance drops. Nebula identifies storage nodes with high NVM space utilization or many hot file data blocks and migrates some of those blocks to other storage nodes, balancing the space utilization and access load of every node and improving overall system performance.
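As a rough illustration of the client-autonomous read path, the sketch below simulates the flow in Python. All names here (MetadataServer, Client, FakeRdma) and the prefetch depth are hypothetical stand-ins, not Nebula's API; the real system issues one-sided RDMA verbs against registered NVM regions rather than Python calls.

```python
# Hypothetical sketch of the client-autonomous I/O path; not Nebula's API.

class MetadataServer:
    """Stand-in for the metadata storage node's block index."""
    def __init__(self, index):
        # index: (path, block_no) -> (data_node, phys_addr, rkey)
        self.index = index

    def query(self, path, block_no, count):
        # One round trip returns up to `count` consecutive index entries so
        # the client can prefetch the index for sequential access.
        return {(path, block_no + i): self.index[(path, block_no + i)]
                for i in range(count) if (path, block_no + i) in self.index}

class Client:
    def __init__(self, metadata_server, rdma):
        self.meta = metadata_server
        self.rdma = rdma
        self.cache = {}  # client-side metadata cache

    def read_block(self, path, block_no, length, prefetch=8):
        key = (path, block_no)
        if key not in self.cache:
            # Cache miss: fetch this entry plus the next few in one round
            # trip (index prefetching), cutting later trips to the
            # metadata node for sequential reads.
            self.cache.update(self.meta.query(path, block_no, prefetch))
        node, phys_addr, rkey = self.cache[key]
        # One-sided RDMA read: the client pulls bytes straight out of the
        # data node's NVM at a known physical address, bypassing the data
        # node's CPU and its local file-system index.
        return self.rdma.read(node, phys_addr, rkey, length)

class FakeRdma:
    """Simulates one-sided reads against per-node byte buffers."""
    def __init__(self, node_memory):
        self.mem = node_memory  # node name -> bytearray standing in for NVM

    def read(self, node, phys_addr, rkey, length):
        # A real client would post an RDMA READ work request here; rkey is
        # what authorizes remote access in real RDMA, unused in this fake.
        return bytes(self.mem[node][phys_addr:phys_addr + length])

if __name__ == "__main__":
    nvm = {"node0": bytearray(b"hello nebula")}
    meta = MetadataServer({("/f", 0): ("node0", 0, 0x77)})
    client = Client(meta, FakeRdma(nvm))
    print(client.read_block("/f", 0, 12))  # b'hello nebula'
```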
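The block-size selection at the heart of the polymorphic space manager can be sketched as follows. The power-of-two size classes and the fallback policy are assumptions made for illustration; the thesis states only that free blocks range from 4 KB to 1 GB and are matched to the write size.

```python
# Illustrative polymorphic block allocation; size-class layout is assumed.

KB = 1 << 10
SIZE_CLASSES = [4 * KB << i for i in range(19)]  # 4 KB, 8 KB, ..., 1 GB

class PolymorphicAllocator:
    def __init__(self, free_blocks):
        # free_blocks: size class -> list of free block addresses in NVM
        self.free = free_blocks

    @staticmethod
    def size_class(write_len):
        """Smallest class that holds the write: large appends get large
        blocks (smaller index, faster I/O); small files get small blocks
        (less wasted NVM space)."""
        for size in SIZE_CLASSES:
            if write_len <= size:
                return size
        raise ValueError("write larger than the 1 GB maximum block size")

    def allocate(self, write_len):
        size = self.size_class(write_len)
        # Fall back to the next larger class if this one is exhausted
        # (an assumed policy, not stated in the abstract).
        for candidate in SIZE_CLASSES[SIZE_CLASSES.index(size):]:
            if self.free.get(candidate):
                return candidate, self.free[candidate].pop()
        raise MemoryError("no free block large enough")

if __name__ == "__main__":
    alloc = PolymorphicAllocator({4 * KB: [0xA000], 64 * KB: [0x20000]})
    print(alloc.allocate(3_000))   # (4096, 40960): exact-fit 4 KB block
    print(alloc.allocate(10_000))  # (65536, 131072): falls back to 64 KB
```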
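The load-balance pass might look like the following sketch. The node statistics, thresholds, and the returned migration plan are all illustrative assumptions; the abstract specifies only the selection criteria (high NVM space utilization or many hot blocks) and the goal of evening out load.

```python
# Illustrative load-balance pass; thresholds and interfaces are assumed.

class StorageNode:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.used = 0
        self.heat = {}  # block id -> access count

    def utilization(self):
        return self.used / self.capacity

    def hot_blocks(self, threshold):
        return [b for b, hits in self.heat.items() if hits >= threshold]

def rebalance(nodes, util_threshold=0.85, heat_threshold=10_000):
    """Plan moves of hot blocks off nodes that are nearly full or heavily
    accessed onto the least-utilized node, evening out both space use and
    network-card access load."""
    moves = []
    for donor in nodes:
        hot = donor.hot_blocks(heat_threshold)
        if donor.utilization() <= util_threshold and not hot:
            continue  # neither space pressure nor access hotspots
        target = min(nodes, key=lambda n: n.utilization())
        if target is not donor:
            # A real migration would copy each block over RDMA and then
            # update the metadata node's index to the new location.
            moves.extend((donor.name, target.name, b) for b in hot)
    return moves

if __name__ == "__main__":
    a, b = StorageNode("a", 100), StorageNode("b", 100)
    a.used, a.heat = 90, {"blk7": 50_000}
    print(rebalance([a, b]))  # [('a', 'b', 'blk7')]
```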
This paper implements a Nebula prototype and measures its read/write performance with the open-source benchmarking tool FIO. The experimental results show that Nebula reaches a read bandwidth of 5974 MB/s and a write bandwidth of 5993 MB/s, more than 95% of the theoretical maximum bandwidth of the RDMA network card. Compared with the existing distributed memory file systems HDFS, Crail, and Octopus, read and write bandwidth improves by 15%-200%. In terms of space utilization, Nebula reaches up to 99%, far higher than HDFS, Crail, and Octopus.
Keywords/Search Tags:NVM, RDMA, Client autonomous data I/O, Polymorphic memory space management, I/O data load balance