Research And Application Of The Optimization Strategy Of File Storage And Reading Based On HDFS

Posted on: 2017-11-07  Degree: Master  Type: Thesis
Country: China  Candidate: C Zhang  Full Text: PDF
GTID: 2348330503992896  Subject: Computer Science and Technology
Abstract/Summary:
In modern society, the volume of network data is growing rapidly and big data storage technology is booming. HDFS (Hadoop Distributed File System) is among the most widely used big data storage and processing technologies. Because of the HDFS storage mechanism, a NameNode memory bottleneck appears when the system has to hold a massive number of small files. Research on optimizing HDFS file storage and reading strategies therefore has real value and practical significance, both for storing massive numbers of small files and for big data processing in general. The need for a massive small file storage platform has emerged with the continued deepening of informatization, and modern network resources are characterized by large quantity and small file size. The results of this study can contribute to the construction of such a platform.

Starting from the characteristics of massive small files and the HDFS storage mechanism, I analyze how HDFS stores and reads files and propose a new PS merging algorithm that balances file-based and block-based relationships. With the PS merging algorithm at its core, I construct the HMM (Hadoop Merging Middleware) intermediate layer; all files uploaded by users must pass through the HMM layer. Experimental tests verify that this improves the processing performance for small files. The main work is as follows:

(1) I survey the current state of domestic and international research on massive small file processing and on massive small file storage platforms, examine the working mechanism of HDFS, and study the technologies required for a massive small file storage platform.

(2) I propose a PS file merging algorithm suited to storing massive numbers of small files. Using file association and data block balancing, the algorithm combines small files into large files stored in HDFS and saves the information about the merged files to Redis, so that fewer data blocks are needed to store the same data. Based on this algorithm, I build the HMM intermediate layer to handle massive small files; when data is read from HDFS, a cache is used to improve reading efficiency.

(3) I summarize the functional requirements of the users of the massive small file storage platform and use open source Hadoop to deploy the platform's development environment. In view of the characteristics of the data resource files (small size, large quantity, unstructured), I combine the MySQL relational database with the Redis in-memory database to build a web-based massive small file storage platform.
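
The abstract does not spell out the PS algorithm, so the following is only a minimal sketch of the general idea behind item (2): packing small files into block-sized merged files and keeping an index that locates each small file inside its merged file. It uses plain first-fit-decreasing packing as a stand-in and does not model the file association step. The names `merge_small_files` and `MergedFile`, the 128 MiB block size, and the in-memory dictionary standing in for the Redis index are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Assumed block size for illustration; the abstract does not state one.
BLOCK_SIZE = 128 * 1024 * 1024


@dataclass
class MergedFile:
    """One large file that would be written to HDFS (hypothetical type)."""
    name: str
    capacity: int
    used: int = 0
    members: list = field(default_factory=list)


def merge_small_files(files, block_size=BLOCK_SIZE):
    """Pack small files into merged files no larger than one block.

    files: dict mapping file name -> size in bytes.
    Returns (merged_files, index), where index maps each small file name
    to (merged file name, byte offset, length). In the thesis this index
    is stored in Redis; a plain dict stands in for it here.

    This is first-fit decreasing bin packing, NOT the thesis's PS algorithm.
    """
    merged = []
    index = {}
    # Largest files first; ties broken by name so the result is deterministic.
    for name, size in sorted(files.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > block_size:
            raise ValueError(f"{name} is larger than one block; store it directly")
        # First merged file with enough free space, if any.
        target = next((m for m in merged if m.used + size <= m.capacity), None)
        if target is None:
            target = MergedFile(name=f"merged-{len(merged):05d}", capacity=block_size)
            merged.append(target)
        index[name] = (target.name, target.used, size)
        target.members.append(name)
        target.used += size
    return merged, index


if __name__ == "__main__":
    sizes = {"a.txt": 60, "b.txt": 50, "c.txt": 40, "d.txt": 30, "e.txt": 20}
    merged_files, lookup = merge_small_files(sizes, block_size=100)
    for m in merged_files:
        print(m.name, m.used, m.members)
    print(lookup["c.txt"])
```

With such an index, reading a small file reduces to one lookup followed by a ranged read of the merged file, which is the kind of access path a read cache in the intermediate layer could sit in front of.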
Keywords/Search Tags: HDFS, Small File, File Merging, Cloud Storage