Research And Implementation Of HDFS File System Application Towards Optical Library

Posted on: 2015-12-29
Degree: Master
Type: Thesis
Country: China
Candidate: N C Wei
Full Text: PDF
GTID: 2308330452457204
Subject: Computer technology
Abstract/Summary:
In the Internet era, many distributed file systems have emerged to provide large-scale data storage services, and HDFS is one of the most widely used open-source distributed file systems. However, the frequently accessed hot data in a distributed file system is only a small part of the total, and most of the cold data will not be used in the short term. Archiving and backing up cold data therefore becomes very important. At the same time, because the optical library offers long data retention time, low cost, and relatively good access speed, it is well suited as a long-term data preservation medium. The optical library can therefore serve as a third-level storage medium to solve the problem of archiving and backing up cold data in a distributed file system.

In order to use the optical library with HDFS, this thesis first studies the structural characteristics of existing optical library systems and the optimization strategies used to improve their file read and write performance, including cache replacement policies and I/O scheduling strategies. It also analyzes the optimization strategies of tape library systems, which, like the optical library, are used for data archiving and backup. The architecture, communication mechanisms, and file read and write processes of HDFS are then studied. Finally, an optical library file management system is implemented on HDFS, featuring three optimization strategies: small-file consolidation, block file caching, and optical library I/O scheduling.

Simulation tests were then carried out on an HDFS cluster using a purpose-built test program. The tests comprise a functional test and performance tests covering memory usage and the read and write performance of small and large files. The tests show that the system basically realizes data backup and archiving, and that the three optimization strategies improve its performance to a certain degree.
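
The abstract names small-file consolidation as one of the three optimizations but does not describe how it works. As a rough illustration only, the sketch below shows one common way such a strategy can be built: small files are packed into fixed-capacity containers, and an index records each file's offset and length so it can be read back later. The `Container` class, the `consolidate` function, and the first-fit placement rule are all assumptions made for this example; they are not taken from the thesis and do not use any HDFS API.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical container: one large blob plus an index of the small files in it."""
    capacity: int
    data: bytearray = field(default_factory=bytearray)
    index: dict = field(default_factory=dict)  # file name -> (offset, length)

    def has_room(self, size: int) -> bool:
        return len(self.data) + size <= self.capacity

    def add(self, name: str, payload: bytes) -> None:
        # Record where the file starts and how long it is, then append its bytes.
        self.index[name] = (len(self.data), len(payload))
        self.data.extend(payload)

    def read(self, name: str) -> bytes:
        offset, length = self.index[name]
        return bytes(self.data[offset:offset + length])


def consolidate(files, capacity):
    """Pack (name, bytes) pairs into containers using first-fit placement (an assumption)."""
    containers = []
    for name, payload in files:
        if len(payload) > capacity:
            raise ValueError(f"{name} is larger than one container")
        # Use the first existing container with enough free space, else open a new one.
        target = next((c for c in containers if c.has_room(len(payload))), None)
        if target is None:
            target = Container(capacity)
            containers.append(target)
        target.add(name, payload)
    return containers


files = [("a", b"x" * 40), ("b", b"y" * 70), ("c", b"z" * 50)]
containers = consolidate(files, capacity=100)
```

With a capacity of 100 bytes, files "a" and "c" share the first container and "b" goes into a second one, so three small writes become two larger ones. A real system would also have to persist the index and choose a container size that suits the optical media.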
Keywords/Search Tags: Optical library, Distributed file system, Small files, Cache replacement strategies, I/O scheduling