Research on Data Storage Management in a Learning Resource Repository Based on HDFS

Posted on: 2016-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Wu
Full Text: PDF
GTID: 2308330503978054
Subject: Computer technology
Abstract/Summary:
A learning resource repository is a resource-sharing platform built and managed by colleges or MOOC platforms, driven by the current push toward education informatization and network sharing. Its purpose is to facilitate information sharing and to expand the resources available to colleges and MOOCs. As the volume of learning resources grows explosively, their storage faces new problems: the reliability and scalability of massive data storage, the efficient storage of small files, and the deduplication of redundant data. After analyzing the advantages and disadvantages of existing solutions to these problems, this thesis proposes a new solution.
For the reliability and scalability of massive data storage, this thesis proposes a storage architecture that uses HDFS to store learning resource data files and HBase to store metadata. Combined with a file compression storage strategy, it proposes a hybrid storage strategy based on file access frequency. It adds HDFS nodes dynamically to scale data storage, and uses the Balancer mechanism to balance the load across storage nodes.
For the storage of large numbers of small files, it proposes merging small files by relying on the uniqueness of each user name and on the HDFS Append operation. This reduces the memory load on the NameNode and so stores small files efficiently.
For redundant data deduplication, this thesis applies the Counting Bloom Filter algorithm and proposes a deduplication technique for learning resources, which reduces frequent I/O operations and improves the efficiency of deduplication.
Finally, a prototype data storage system is implemented. Based on the proposed solutions, this thesis tests the system's performance and analyzes the test results, which show that the proposed solutions are effective in addressing the problems described above.
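The abstract names the Counting Bloom Filter as the basis for deduplication but gives no detail of the thesis's implementation. The following is only a minimal sketch of the general technique, under stated assumptions: the class `CountingBloomFilter`, the helper `store_if_new`, the filter size, the number of hash functions, and the use of SHA-256 fingerprints are all hypothetical choices for illustration, and the in-memory set `known` merely stands in for whatever authoritative metadata lookup the real system would perform.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter with per-slot counters, so items can also be removed."""

    def __init__(self, size=1024, num_hashes=4):
        # size and num_hashes are illustrative defaults, not values from the thesis.
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _positions(self, item: bytes):
        # Double hashing: derive num_hashes slot indexes from one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.counters[pos] > 0 for pos in self._positions(item))

    def remove(self, item: bytes) -> bool:
        # Only decrement when the item may be present, to avoid negative counters.
        if not self.might_contain(item):
            return False
        for pos in self._positions(item):
            self.counters[pos] -= 1
        return True


def store_if_new(cbf: CountingBloomFilter, known: set, content: bytes) -> bool:
    """Return True if content is new and was recorded, False if it is a duplicate.

    `known` is a hypothetical stand-in for an authoritative fingerprint store.
    """
    fingerprint = hashlib.sha256(content).digest()
    # A negative filter answer skips the more expensive authoritative lookup.
    if cbf.might_contain(fingerprint) and fingerprint in known:
        return False
    cbf.add(fingerprint)
    known.add(fingerprint)
    return True
```

In this sketch the filter answers "definitely new" without touching the authoritative store, which is one way such a filter can cut I/O. A positive answer may be a false positive, so it is confirmed against `known` before the content is treated as a duplicate. The counters, unlike the single bits of a plain Bloom filter, allow a fingerprint to be removed when a resource is deleted.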
Keywords/Search Tags:Learning Resource Repository, Data Storage, MOOC, HDFS, HBase, Counting Bloom Filter, Deduplication