
Leveraging Hash Algorithms In Distributed Metadata Management And Optimization

Posted on: 2020-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: J X Liu
Full Text: PDF
GTID: 2428330623963629
Subject: Computer technology
Abstract/Summary:
Current distributed file systems are designed to support petabyte-scale and even exabyte-scale data storage. The metadata service, which manages file attribute information and the global namespace tree, is crucial to system performance. Distributed metadata management, which uses multiple metadata servers (MDSs) to store metadata, is an effective way to relieve the workload of a single server. However, preserving good metadata locality and balancing load among MDSs are conflicting goals that are hard to achieve simultaneously. In this thesis, we propose two novel schemes for distributed metadata management. One is a Locality Preserving Hashing (LPH) based scheme called AngleCut; the other is DeepHash, which uses machine learning techniques to learn an LPH function.

AngleCut is a specially designed hashing scheme that uses an LPH function to project the namespace tree into a linear keyspace, i.e., multiple Chord-like rings. The LPH function preserves the relative positions of metadata nodes in the namespace tree, which in turn preserves metadata locality. AngleCut adopts a history-based allocation strategy to adjust the workload of MDSs dynamically. A metadata cache mechanism is also integrated into AngleCut to improve query efficiency.

DeepHash is a machine learning scheme that leverages neural networks to learn an LPH mapping. DeepHash first converts metadata nodes to feature vectors according to the structure of the namespace tree. We design two novel loss functions, the pair loss and the triplet loss, with which DeepHash learns the relative positions of metadata nodes. We also propose another dynamic allocation algorithm to balance the workload among MDSs. To the best of our knowledge, we are the first to adopt machine learning to optimize distributed metadata management.

To evaluate AngleCut and DeepHash in terms of metadata locality and degree of load balancing, we conduct extensive experiments on several real-world data traces. The experimental results, together with theoretical proofs, show that our schemes outperform state-of-the-art schemes.
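The general idea of a locality-preserving projection of a namespace tree into a linear keyspace can be illustrated with a minimal sketch. This is not the AngleCut function, which the abstract does not specify; it is a generic interval-subdivision mapping chosen for illustration. The names `assign_keys` and `server_for`, the nested-dict tree representation, and the fixed cut points are all assumptions made for this example.

```python
from bisect import bisect_right


def assign_keys(tree, path="/", lo=0.0, hi=1.0, keys=None):
    """Assign each node of a nested-dict namespace tree a key in [lo, hi).

    Hypothetical illustration, not the thesis's LPH function. A node takes
    the start of its interval as its key; the interval is then cut into
    len(children) + 1 equal slots, with slot 0 left to the node itself and
    one slot per child (in sorted name order). Keys therefore follow the
    pre-order of the tree, so every subtree occupies a contiguous key range.
    """
    if keys is None:
        keys = {}
    keys[path] = lo
    children = sorted(tree)
    if children:
        width = (hi - lo) / (len(children) + 1)
        for i, name in enumerate(children, start=1):
            child_path = path.rstrip("/") + "/" + name
            assign_keys(tree[name], child_path,
                        lo + i * width, lo + (i + 1) * width, keys)
    return keys


def server_for(key, boundaries):
    """Return the index of the server whose key range contains `key`.

    `boundaries` is a sorted list of cut points (assumed given here);
    n cut points define n + 1 contiguous ranges, one per metadata server.
    """
    return bisect_right(boundaries, key)


# Example namespace: directories and files are dicts; leaves are empty dicts.
tree = {
    "home": {"alice": {"a.txt": {}, "b.txt": {}}, "bob": {}},
    "var": {"log": {}},
}
keys = assign_keys(tree)
for p in sorted(keys, key=keys.get):
    print(f"{keys[p]:.4f}  server {server_for(keys[p], [0.5])}  {p}")
```

Because each subtree maps to a contiguous key range, assigning contiguous ranges to servers keeps most of a directory's metadata on one server, and moving a cut point shifts load between neighbouring servers. A fixed cut point is used here only for simplicity; the thesis describes dynamic allocation strategies instead. Floating-point keys also lose resolution in very deep trees, which a real scheme would need to address.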
Keywords/Search Tags: Locality Preserving Hashing, Distributed Metadata Management, Neural Networks