Research Of Data Storage And Management On Huatu Online Library System Based On HDFS

Posted on:2014-05-31

Degree:Master

Type:Thesis

Country:China

Candidate:C Yang


GTID:2268330425471032

Subject:Computer Science and Technology

Abstract/Summary:


As a platform for sharing information, the online library system brings users efficiency and convenience. However, as data volume and user activity grow, the forms and types of library resources multiply and diversify, and the exponential growth of mass data strains the storage system; how to store and manage this data efficiently has become a pressing problem. The emergence of cloud storage technology makes it possible to store and manage such huge amounts of data efficiently. In this thesis, the currently popular cloud computing platform Hadoop is selected as the data storage and management platform for the online library system, and the Hadoop Distributed File System (HDFS) is used to store and manage its document files. Because HDFS is designed to address general data storage and management challenges, it cannot be applied to the online library system directly and must be improved. Documents in the online library system are typically Word, PDF, TXT and similar files, which are relatively small: more than 90% of them range from 32 KB to 20 MB. HDFS keeps the metadata of every file in the memory of the metadata management node (NameNode), so storing vast numbers of small files leads to excessive memory consumption in the NameNode; once the NameNode's memory is used up, HDFS cannot store any more files. This thesis therefore proposes an optimized scheme that merges small files into a large file, which effectively reduces NameNode memory consumption. In addition, to address read speed, we put forward a data prefetching mechanism with two levels of cache, which significantly improves the fluency of file reading for users and relieves the pressure on the NameNode in HDFS.
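The small-file merging idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual scheme, whose details are not given here. It assumes the simplest possible layout: small files are concatenated into one blob, and a separate index records each file's offset and length. The names `IndexEntry`, `merge_small_files` and `read_small_file` are hypothetical. The sketch runs entirely in memory and makes no HDFS calls.

```python
from __future__ import annotations

import io
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Location of one small file inside the merged blob (hypothetical layout)."""
    offset: int  # byte offset where the file starts
    length: int  # file size in bytes


def merge_small_files(files: dict[str, bytes]) -> tuple[bytes, dict[str, IndexEntry]]:
    """Concatenate small files into one blob and build a name -> location index."""
    blob = io.BytesIO()
    index: dict[str, IndexEntry] = {}
    for name, data in files.items():
        # Record where this file begins before appending its bytes.
        index[name] = IndexEntry(offset=blob.tell(), length=len(data))
        blob.write(data)
    return blob.getvalue(), index


def read_small_file(blob: bytes, index: dict[str, IndexEntry], name: str) -> bytes:
    """Recover one original file from the merged blob using the index."""
    entry = index[name]
    return blob[entry.offset:entry.offset + entry.length]


if __name__ == "__main__":
    documents = {"a.txt": b"hello", "b.pdf": b"%PDF-1.4 demo"}
    merged, idx = merge_small_files(documents)
    print(len(merged), idx["b.pdf"])
    print(read_small_file(merged, idx, "a.txt"))
```

Under this assumption the file system tracks one large file instead of many small ones, which is the source of the NameNode memory saving. The cost is that the index must itself be stored and consulted on every read.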

Keywords/Search Tags:

Cloud Storage, Mass Storage, Hadoop, HDFS, File System


Related items

1	Design And Implementation Of Massive Audio File Storage System Based On HADOOP
2	Based On The Hadoop Mass File Storage System Analysis And Design
3	The Technical Research Of Optimization Of File Storage In HDFS
4	Study On Storage Mode For Mass Data Storage
5	Research And Realization Of Small Cloud Storage System Based On HDFS
6	Research On Cloud Storage Based On Hadoop Distributed File System
7	The Design And Implementation Of Cloud Storage System Based On Hdfs
8	The Implementation And Optimization Of Cloud Storage System Based On HDFS
9	Research And Design On Hadoop-based Cloud Storage Platform Of New Campus
10	Design And Implementation Of Cloud Security Storage System Based On Hadoop