Research On The Key Techniques For Parallel File Storage System

Posted on: 2013-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: B Huang
Full Text: PDF
GTID: 2248330374975325
Subject: Computer software and theory
Abstract/Summary:
As computer and information technology continue to evolve, huge amounts of data continue to emerge. Stand-alone storage technology cannot handle such data, despite its considerable development over the past decades. The most important issue is therefore to construct a data storage system with high performance, high capacity, high reliability and high scalability.

Distributed parallel file storage systems are a hot research topic in both academia and industry, and many achievements have been made. However, these systems were generally designed to meet the needs of the institutions that developed them, so limitations and shortcomings are inevitable. Current research is therefore not yet mature and leaves room for further study and improvement. The work of this thesis is as follows.

This thesis first compares and analyzes some popular distributed parallel file storage systems (GFS, Global File System, etc.), summarizes their advantages and disadvantages, and presents a new architecture. On this basis, it studies three key technical problems in distributed parallel file systems: construction and management of the index data, data management in the storage node, and the load balancing strategy.

For the first problem, this thesis compares and analyzes the tree file structure and presents an alternative, the flat file structure. On this basis, an index structure based on hash tables and a corresponding hash function are designed. In addition, an extension mechanism based on the consistent hash algorithm is presented in order to improve scalability.

For the second problem, this thesis first shows, by analyzing the principles and details of the Linux file system, that Linux loses performance when storing a large number of files. On this basis, it presents a data storage solution that uses a merge mechanism, together with a detailed description of the data structures on disk and in memory. In addition, a concurrency strategy is designed.

For the last problem, this thesis first analyzes two causes of system load imbalance: imbalance of client requests and hot data. For the former, it presents a load balancing strategy based on a server load model and static server performance. For the latter, it presents a replica management strategy based on heat statistics that increases the number of replicas dynamically.
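The abstract does not describe how the consistent-hash extension mechanism is built, so the following is only a minimal sketch of the general consistent hashing technique, not the thesis's actual design. The class name ConsistentHashRing, the use of MD5, the virtual_nodes parameter and the node and file names are all assumptions made for illustration. It shows the property that motivates the approach: when a storage node is added, only the keys that now belong to the new node move.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Toy consistent-hash ring (illustrative; not the thesis's design)."""

    def __init__(self, virtual_nodes=100):
        # virtual_nodes is a hypothetical tuning knob: more points per
        # node spread keys more evenly around the ring.
        self.virtual_nodes = virtual_nodes
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> node name

    @staticmethod
    def _hash(key):
        # MD5 is used here only as a convenient, stable hash (assumption).
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.virtual_nodes):
            point = self._hash(f"{node}#{i}")
            if point in self._owner:
                continue  # keep the existing owner on a hash collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key, wrapping at the end.
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._owner[self._points[idx % len(self._points)]]


if __name__ == "__main__":
    ring = ConsistentHashRing()
    for name in ("node-a", "node-b", "node-c"):
        ring.add_node(name)
    keys = [f"file-{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.add_node("node-d")
    moved = [k for k in keys if ring.lookup(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

With a plain "hash(key) mod N" placement, changing N would remap most keys; on the ring, a key changes owner only if a point of the new node now lies between the key and its old owner.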
Keywords/Search Tags: Huge amounts of data, Distributed parallel file storage system, Consistent hash, Merge, Load Balancing