
Research On Retrieval Performance Optimization Techniques For Distributed File System

Posted on: 2019-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Chen
GTID: 2428330596460866
Subject: Computer Science and Technology
Abstract/Summary:
Many mature commercial distributed file systems handle the storage of massive numbers of files, but storing massive numbers of small files remains a technical challenge for both academia and industry. Some studies have tried to address it by introducing a distributed file system, yet most mature distributed file systems are designed for large-file storage scenarios. Existing research on optimizing distributed file systems for large-scale small-file scenarios mostly aims to eliminate single points of failure, remove performance bottlenecks, or reduce resource consumption. Most of these methods optimize a single aspect, such as the file storage model, the metadata management method, or the system architecture, and no comprehensive optimization technique targets retrieval performance. Against this background, this thesis studies feasible techniques and methods for improving the retrieval performance of distributed file systems in scenarios with massive numbers of small files. The main work and contributions are:

(1) Integrating the metadata server architecture, retrieval algorithm, file partitioning strategy, and replica placement strategy to improve the retrieval performance of the distributed file system.
(2) Proposing the concept of a metadata server group based on the hybrid metadata management method, designing the member management strategy and metadata consistency algorithm for the master-slave metadata server group, and proposing a distributed computing retrieval algorithm tailored to the master-slave structure of the group.
(3) Exploring and verifying that having the client read a file from the storage servers in segments concurrently, and merge the segments in memory, improves the read performance of the distributed file system.
(4) Designing and implementing ZRDFS+, a prototype distributed file system, and verifying the effectiveness of the design and improvements through experiments. The experimental results show that, compared with ZRDFS, ZRDFS+ improves file write performance by 36.8% and file read performance by 22.3%; compared with HDFS, its file write performance is 17.1% lower, while its file read performance is 51.4% higher.
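The segmented concurrent read in contribution (3) can be illustrated with a minimal Python sketch. It is not the ZRDFS+ implementation, which the abstract does not describe in detail. The names `split_ranges`, `read_segmented`, and `fetch_from_blob` are hypothetical, and an in-memory byte string stands in for the storage server.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, segment_size):
    """Return (offset, length) pairs that cover a file of `size` bytes."""
    return [(off, min(segment_size, size - off))
            for off in range(0, size, segment_size)]


def read_segmented(fetch, size, segment_size, workers=4):
    """Fetch all segments concurrently and merge them in memory.

    `fetch(offset, length)` is a hypothetical callable that returns the
    requested byte range from a storage server.
    """
    ranges = split_ranges(size, segment_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of `ranges`, so joining the
        # parts reassembles the file in offset order.
        parts = pool.map(lambda r: fetch(*r), ranges)
        return b"".join(parts)


# Stand-in for a storage server (assumption): serves ranges of a blob.
blob = bytes(range(256)) * 40  # 10240 bytes


def fetch_from_blob(offset, length):
    return blob[offset:offset + length]


data = read_segmented(fetch_from_blob, len(blob), segment_size=1000)
```

In a real deployment each `fetch` call would be a network request, possibly to a different replica, which is where the concurrency would pay off. The sketch shows only the splitting and in-memory merging logic.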
Keywords/Search Tags: Distributed File System, Lots of Small Files, Retrieval Technology, Metadata Management