
Design And Implementation Of Cloud Storage System Based On Hadoop

Posted on: 2021-03-11  Degree: Master  Type: Thesis
Country: China  Candidate: W Cheng  Full Text: PDF
GTID: 2438330602997660  Subject: Electronics and Communications Engineering
Abstract/Summary:
With the spread of information technology throughout society, Internet use has become nearly universal. Big data has entered a stage of rapid global development, and the volume of data produced is growing exponentially, so how to store and analyze such massive data has become a pressing issue. Cloud storage, delivered as a service, is widely used across many storage scenarios; its scalability, reliability, and stability make it a strong solution for mass data storage. Hadoop, a mainstream distributed storage project for cloud storage, runs on low-cost hardware and provides reliable fault tolerance, and it is favored by many enterprises and research institutions. This thesis systematically describes the design and implementation of a cloud storage system based on the Hadoop architecture. To solve the small-file storage problem, the original HDFS system is modified and the HPM solution is proposed, with several functional modules designed at the data processing layer. Because small files vary widely in size, an optimal small-file merging algorithm is designed to merge them so that they are evenly distributed across data blocks, making full use of each block's capacity and reducing the unused space within it. This lowers the memory overhead of the NameNode, which is reduced by nearly 95.58% compared with native HDFS. In addition, file indexing and hot-spot caching are designed on the basis of an Ehcache cache prefetching scheme. Before data is written, the file is indexed: its multiple tags are concatenated into a string that serves as the RowKey and is stored in the HBase database. Read methods are then designed for the different kinds of file identifier, and the Ehcache caching strategy is used to prefetch and cache hot data, improving the read efficiency of the Hadoop cluster. A comparative experiment verifies that the read rate of this solution is about 2.01 times higher than that of native HDFS. Following an analysis of the requirements and feasibility of the cloud storage solution, the overall architecture of the cloud storage system is designed, and on this basis the technical architecture, web-side load balancing, and database design are carried out. Finally, the system environment is deployed and the system functions are implemented. This stage mainly verifies the access characteristics of the system under the B/S (browser/server) mode and implements user management, directory management, and file upload, download, sharing, and deletion, so that the basic functions and characteristics of a cloud storage system are realized.
Keywords/Search Tags:Cloud storage, Hadoop, Small files, HDFS, Prefetch cache, B/S
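
The abstract does not spell out the merging algorithm, so the following is only an illustrative sketch of the general idea of packing small files into data blocks so that little space is left unused. It swaps in a standard first-fit-decreasing heuristic as a stand-in, not the thesis's own algorithm; the function name `pack_small_files`, the 128 MB block size, and the file names are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed block size in bytes (128 MB is a common HDFS default; the thesis
# abstract does not state the value it used).
BLOCK_SIZE = 128 * 1024 * 1024


def pack_small_files(sizes: Dict[str, int],
                     block_size: int = BLOCK_SIZE) -> List[List[str]]:
    """Group small files into blocks with first-fit decreasing.

    Hypothetical helper: `sizes` maps a file name to its size in bytes.
    Returns one list of file names per merged block.
    """
    free: List[int] = []          # remaining bytes in each open block
    groups: List[List[str]] = []  # file names assigned to each block

    # Placing the largest files first tends to leave less unused space.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        if size > block_size:
            raise ValueError(f"{name} does not fit in a single block")
        for i, remaining in enumerate(free):
            if size <= remaining:
                # First block with enough room takes the file.
                free[i] -= size
                groups[i].append(name)
                break
        else:
            # No open block had room: start a new one.
            free.append(block_size - size)
            groups.append([name])
    return groups
```

With a toy block size of 100 bytes and files of 60, 50, 40, 30, and 20 bytes, this sketch fills two blocks completely, so the NameNode would track two merged blocks instead of five separate files.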