
Research And Improvement Of The Massive Distributed File System Based On HDFS

Posted on: 2013-08-18  Degree: Master  Type: Thesis
Country: China  Candidate: D H Fu  Full Text: PDF
GTID: 2248330374499343  Subject: Software engineering
Abstract/Summary:
With the development of technology and the rapid growth of information, information storage has become increasingly important. How to store massive data quickly and efficiently is an important and attractive research subject. This paper analyzes the Google File System (GFS) and the Hadoop Distributed File System (HDFS). Because of its excellent scalability and open-source nature, HDFS has attracted a lot of attention, but many aspects need improvement before it can be used for massive distributed file storage. Based on HDFS, by studying its data structure and organization and its detailed read and write procedures, and by drawing on other excellent distributed file systems, we introduce several new mechanisms to HDFS to improve its performance.

Firstly, this paper improves the structure of HDFS. Through small clusters, an upper-level indexing system, and a dual hot-backup system, the distributed file system can effectively store massive data and scale better as the number of users and applications grows rapidly.

Secondly, by using raw devices, changing the block size, and setting an offset ID, we improve the performance of HDFS. The improved system stores both large and small files well, such as pictures, videos, documents, and voice recordings.

Thirdly, the improved system uses different cache strategies in the namenode, the datanode, and the client. Through asynchronous reading and writing, applications can respond to users more quickly.

Finally, this paper analyzes the reading and writing processes and designs several experiments to compare the performance of the improved system with that of HDFS. The experimental results show that the namenode bottleneck has been alleviated, that the improved HDFS stores both large and small files well, and that it delivers better performance.
Keywords/Search Tags: HDFS, Distributed File System, Massive Data Storage, Structure