
Analysis and Design of a Mass File Storage System Based on Hadoop

Posted on: 2016-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Zhang
Full Text: PDF
GTID: 2308330503450629
Subject: Computer Science and Technology
Abstract/Summary:
The Internet continues to expand rapidly, and the trend toward informatization, intelligence, and ever-larger data volumes is increasingly evident. Web portals and e-commerce sites keep growing and consolidating into groups. Internet giants such as Tencent, Taobao, Baidu, and Sina provide a wide range of services, and their data storage has entered the mass-storage stage and is growing explosively. Scaling mass storage vertically is increasingly expensive, and commercial storage places a growing burden on enterprises, to the point of becoming a bottleneck that restricts their development. A mass file storage system that offers high capacity and high concurrency is therefore urgently needed.

Based on an analysis of actual requirements, this thesis designs a distributed storage system architecture based on Hadoop. The model uses the Hadoop HDFS distributed file system as the underlying file store and inexpensive Linux cluster hardware as its foundation. It relies on HDFS for high responsiveness, high fault tolerance, high concurrency support, and cluster data balancing in order to build a mass file storage system that provides reliable service to external users. The HDFS distributed file system and the Hadoop MapReduce parallel programming framework provide strong technical support for the design of this large-scale data storage architecture and enable efficient file access under high concurrency and high load.
Caching and load balancing are designed into the system to improve performance under high concurrency and to optimize file reads and writes. Mass file storage produces a large volume of file metadata, which is stored in HBase, a distributed column-oriented database whose high capacity and efficiency meet the storage requirements. The HBase row key is designed around the main factors, such as file type and owner, so that related files can be stored at physically close locations on the cluster nodes. This reduces disk seeks and cross-node, cross-network addressing, and improves the efficiency of file access.

A Hadoop cluster is built and the application server is deployed. High-concurrency stress experiments are run, and the experimental data are collected and analyzed to verify that the system architecture achieves its targets. This thesis focuses on the challenges of high concurrency and high capacity, aiming to scale capacity horizontally, reduce storage costs, and provide efficient service. The system uses currently mature distributed technologies for file storage and processing, and the deployment consists of a Hadoop cluster together with application servers, file servers, cache servers, and so on. Analysis of the test data is used to assess the model's practical effect and to determine whether the proposed architecture can support mass file storage and management.
Keywords/Search Tags: Hadoop, mass file, distributed, cloud computing, file storage model