Research On Hadoop/Mapreduce-based Scalable Storage System

Posted on: 2013-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: J X Wang
GTID: 2248330392957820
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of the Internet, data volumes have grown exponentially, and storing and processing this data is a challenging problem. Hadoop lets users develop distributed applications that make full use of a cluster's mass storage and high-speed computing, even if they are not familiar with distributed systems.

The best-known components of Hadoop are the MapReduce distributed computing framework and the distributed file system HDFS. Its main features are low cost, good scalability, high efficiency, and excellent reliability. It runs on multiple operating systems and on commodity hardware.

However, HDFS was originally designed to store large files. Some applications generate large numbers of small files, and as the number of small files grows, file storage slows down and the system load rises sharply. In this thesis, we propose a new architecture (HMRF) to solve this problem; it merges small files into large files to optimize small-file storage.

Experiments show that HMRF, a scalable storage system architecture based on Hadoop/MapReduce, can store massive amounts of data efficiently. With HMRF, namenode memory usage decreased by 63.2%, datanode memory usage decreased by 38.7%, and write speed increased by 171%.
Keywords/Search Tags: Cloud storage, HDFS, MapReduce, small files
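
The abstract does not describe how HMRF merges files, so the following is only a minimal, generic sketch of the underlying idea: many small files are packed into one large blob, and a small index records where each one lives. The function names (`merge_small_files`, `read_small_file`) and the in-memory layout are assumptions made for illustration and are not taken from the thesis or from any Hadoop API.

```python
import io


def merge_small_files(files):
    """Pack many small files into one blob and build an offset index.

    files: dict mapping file name -> bytes content (hypothetical input).
    Returns (blob, index), where index maps name -> (offset, length).
    """
    buffer = io.BytesIO()
    index = {}
    for name, data in files.items():
        # Record where this file starts inside the merged blob.
        index[name] = (buffer.tell(), len(data))
        buffer.write(data)
    return buffer.getvalue(), index


def read_small_file(blob, index, name):
    """Recover one original file from the merged blob using the index."""
    offset, length = index[name]
    return blob[offset:offset + length]


# Illustrative data only; a real system would write the blob to HDFS.
small_files = {"a.txt": b"alpha", "b.txt": b"be", "c.txt": b"gamma!"}
blob, index = merge_small_files(small_files)
```

Under this scheme the file system tracks a single large file rather than one entry per small file, which is the kind of metadata saving the abstract attributes to merging. The index still has to be stored and looked up somewhere, and how HMRF handles that is not stated here.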