Design and Realization of Parallel File IO Based on Hadoop Distributed File System

Posted on: 2011-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: S C Jin
Full Text: PDF
GTID: 2198330338989832
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computer networks and their applications, and especially since Google proposed Internet-based mass data storage and the MapReduce parallel computing model, network-based data storage management and parallel analysis and processing have become a focus of academia and industry. As one of the reference implementations of these ideas, Hadoop has attracted widespread attention.

To control parallel file IO, the core of Hadoop, the Hadoop Distributed File System (HDFS), uses a locking mechanism, but it does not support multiple users reading and writing the same file in parallel. This thesis therefore proposes a parallel file IO model based on block granularity and verifies its feasibility through experiments.

The main work of this thesis is as follows:
(1) Related work on Hadoop, particularly on the Hadoop Distributed File System (HDFS), is analyzed in depth. Based on Hadoop's deficiency in multi-user parallel file IO, ideas for improvement are put forward.
(2) By analyzing the implementation of Hadoop, a multi-user parallel IO model without a mutual exclusion mechanism is proposed for distributed file systems. Based on this model, and at the cost of appropriately relaxing the integrity guarantees on data reads, multiple users can read and write the same file in parallel.
(3) The functions described in the model are implemented by modifying the Hadoop source code, and experiments are carried out to verify the function and performance of the model.
Keywords/Search Tags: Massive data management, Distributed file system, Hadoop, Parallel file IO
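
The abstract describes parallel IO at block granularity without mutual exclusion, at the cost of weaker read integrity. The following is a minimal sketch of that general idea only, not the thesis's implementation and not the HDFS API. The `BlockFile` class, the `parallel_write` helper, and the tiny `BLOCK_SIZE` are all hypothetical names and values chosen for illustration, and the "file" is an in-memory list rather than a distributed store.

```python
import threading

# Hypothetical block size for illustration; real HDFS blocks are far larger.
BLOCK_SIZE = 4


class BlockFile:
    """In-memory stand-in for a file stored as fixed-size blocks."""

    def __init__(self, num_blocks):
        # Every block starts out zero-filled.
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        # No file-wide lock: a writer replaces one whole block at a time,
        # so writers that target different blocks do not wait on each other.
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = data

    def read_all(self):
        # Readers take no lock either. A read that overlaps with writers
        # may return a mix of old and new blocks (relaxed read integrity).
        return b"".join(self.blocks)


def parallel_write(block_file, updates):
    """Apply {block_index: data} updates, one thread per block."""
    threads = [
        threading.Thread(target=block_file.write_block, args=(index, data))
        for index, data in updates.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    f = BlockFile(num_blocks=3)
    parallel_write(f, {0: b"AAAA", 2: b"CCCC"})
    print(f.read_all())
```

In this sketch, two users writing different blocks of the same file never block each other. The trade-off is visible in `read_all`: a reader running concurrently with writers has no guarantee of seeing a single consistent version of the whole file.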