
The Research Of Optimizing The Read Performance Of Distributed File System Based On High-speed Network

Posted on: 2015-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhao
GTID: 2308330452955804
Subject: Computer system architecture
Abstract/Summary:
HDFS (Hadoop Distributed File System) is widely used for storing and processing massive data sets. To meet the big data analysis requirement of real-time queries with second-level response times over massive data, high-speed networks are widely deployed inside data centers. However, the read performance of existing HDFS-based distributed file systems does not increase linearly with network speed. On a high-speed network, limited disk I/O speed prevents the existing distributed file system from making full use of the network to serve upper-layer applications.

To address this problem, a memory-saving buffer management mechanism called MSBM (Memory Saving Buffer Manager) is proposed. MSBM caches data on the data server side so that the data are already in the cache the next time a client reads them. Its prefetching strategy prefetches only one data block, which improves read performance while limiting memory usage, so MSBM can run on low-cost distributed clusters. Three buffer management queues are used to manage the cache, handle block replacement, and provide the data reading service. In addition, a corresponding load-balancing scheme is proposed so that the buffer management mechanism achieves its best effect. With efficient buffer management for massive large files, the distributed file system can make full use of the InfiniBand network and improve the read performance of the entire system in a high-speed network environment.

Comparative tests with and without MSBM, under both Gigabit Ethernet and InfiniBand network environments, show that read throughput improves by 50% to 150% with MSBM, which means MSBM can be effectively applied to distributed file systems in high-speed network environments.
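
The one-block prefetching idea described in the abstract can be illustrated with a small sketch. This is not the thesis's MSBM implementation: the abstract does not specify how the three buffer management queues are organized, so the sketch assumes a single LRU-ordered cache instead. The class name `OneBlockPrefetchCache`, the `read_block` callback (standing in for disk I/O), and the `capacity` parameter are all hypothetical, and the prefetch is done synchronously for simplicity, whereas a real data server would presumably do it in the background.

```python
from collections import OrderedDict


class OneBlockPrefetchCache:
    """Hypothetical server-side block cache with one-block lookahead.

    Assumption: a single LRU cache replaces MSBM's three queues,
    whose details the abstract does not give.
    """

    def __init__(self, read_block, num_blocks, capacity):
        self.read_block = read_block  # callable: block index -> bytes (stands in for disk I/O)
        self.num_blocks = num_blocks  # total number of blocks in the file
        self.capacity = capacity      # maximum number of cached blocks (>= 1)
        self.cache = OrderedDict()    # block index -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def _put(self, idx, data):
        # Insert or refresh a block, then evict least recently used blocks.
        self.cache[idx] = data
        self.cache.move_to_end(idx)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def read(self, idx):
        if idx in self.cache:
            self.hits += 1
            data = self.cache[idx]
            self.cache.move_to_end(idx)
        else:
            self.misses += 1
            data = self.read_block(idx)
            self._put(idx, data)
        # Prefetch exactly one block ahead to keep memory usage low.
        nxt = idx + 1
        if nxt < self.num_blocks and nxt not in self.cache:
            self._put(nxt, self.read_block(nxt))
        return data
```

For a purely sequential read, only the first block misses in this sketch; every later block has already been prefetched by the preceding read, even with a cache capacity of two blocks.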
Keywords/Search Tags: Distributed File System, InfiniBand network, Cache, Prefetching, Throughput