Design And Realization Of Parallel File IO Based On Hadoop Distributed File System

Posted on:2011-05-07

Degree:Master

Type:Thesis

Country:China

Candidate:S C Jin

Full Text:PDF

GTID:2198330338989832

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of computer networks and their applications, and especially since Google proposed Internet-based mass data storage and the MapReduce parallel computing model, network-based data storage management and parallel analysis and processing have become a focus of both academia and industry. As one of the reference implementations of these ideas, Hadoop has attracted widespread attention. To control parallel file IO, the Hadoop Distributed File System (HDFS), the core of Hadoop, uses a lock mechanism, but it does not support multiple users reading and writing the same file in parallel. This thesis therefore proposes a parallel file IO model based on Block granularity and verifies the feasibility of the model through experiments.

The main work of this thesis is as follows:

(1) Related work on Hadoop, and on the Hadoop Distributed File System (HDFS) in particular, is analyzed in depth, and ideas for improvement are put forward to address Hadoop's deficiency in multi-user parallel file IO.

(2) Based on an analysis of the Hadoop implementation, a multi-user parallel IO model without a mutual exclusion mechanism is proposed for distributed file systems. Based on this model, and on the condition that the integrity of data reads is appropriately relaxed, multiple users can read and write the same file in parallel.

(3) The functions described in the model are implemented by modifying the source code, and experiments are carried out to verify the function and performance of the model.
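The block-granularity idea in the abstract can be illustrated with a small sketch. This is not the thesis's implementation and does not use the HDFS API: it is an in-memory toy in Python, and the names `BlockFile`, `write_block`, `read_all`, `parallel_write`, and the tiny `BLOCK_SIZE` are all hypothetical. It only assumes that a file is a sequence of fixed-size blocks and that each writer touches a different block, so no file-wide lock is taken.

```python
import threading

# Hypothetical block size, tiny for illustration only (real HDFS blocks are far larger).
BLOCK_SIZE = 4


class BlockFile:
    """Toy in-memory file split into fixed-size blocks (not the HDFS API)."""

    def __init__(self, num_blocks):
        # Each block is an independent slot; replacing one slot leaves the others untouched.
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        # No file-wide lock: a writer replaces only the single block it was assigned.
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = data

    def read_all(self):
        # While writers are active, a reader may see a mix of old and new blocks.
        # This is the relaxed read integrity that the abstract accepts as a trade-off.
        return b"".join(self.blocks)


def parallel_write(target, assignments):
    """Write each (block index, data) pair from its own thread, then wait for all."""
    threads = [
        threading.Thread(target=target.write_block, args=(index, data))
        for index, data in assignments
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


f = BlockFile(num_blocks=3)
parallel_write(f, [(0, b"AAAA"), (1, b"BBBB"), (2, b"CCCC")])
result = f.read_all()
```

Because every writer is assigned a distinct block index, the final content does not depend on thread scheduling. Two writers targeting the same block would need some additional policy, which this sketch deliberately leaves out.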

Keywords/Search Tags:

Massive data management, Distributed file system, Hadoop, Parallel file IO

Related items

1	Design And Realization Of Parallel File IO Based On Hadoop Distributed File System
2	Research And Implementation Of Hadoop Small File Processing Technology
3	Large Space Aggregation Storage Technology Research And Implementation For Massive Small File System
4	Research On Small File Storage Mechanism For Hadoop
5	The Design And Implementation Of Massive Small Files Storage System Based On HDFS
6	Design And Implementation Of Massive Audio File Storage System Based On HADOOP
7	Research And Implement Of Distributed Massive Small File Storage Access Optimization
8	Research And Design Of High Performance Distributed File System For Small File
9	Research Of Distributed Storage Of Massive RDF Data
10	Research On Data Deduplication Technology Based On Hadoop