
Research on Cloud Storage System Optimization Based on HDFS

Posted on: 2017-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: G P Yang
GTID: 2428330596457379
Subject: Engineering
Abstract/Summary:
Cloud computing has developed rapidly in recent years and has become a focus of attention at home and abroad, bringing profound changes to daily life and business models. Cloud storage systems have developed along with it. Among the many open-source cloud platforms, Hadoop has been widely adopted in industry for its high efficiency, high reliability, and high scalability. HDFS (Hadoop Distributed File System) is one of Hadoop's core modules and provides strong data storage capacity. This thesis therefore studies the optimization of HDFS in cloud storage systems. The main work is as follows. First, based on an analysis of the principles of HDFS, I propose a dynamic replica storage scheme to improve the access speed and storage efficiency of HDFS and to improve the system's load balancing. Second, after analyzing the native HDFS and HDFS Federation architectures, I design a system architecture based on access frequency; taking the fairness of file access into account, I design a frequency threshold and formulate a new access rule. I also create a hotspot cache service that sends file metadata directly to the client when users access hotspot data, which reduces the pressure on the master node and improves access efficiency. Third, I design the cluster network structure, build the cluster platform with virtualization technology, and use the Ganglia distributed cluster monitoring tool to compare the unimproved HDFS with the improved HDFS. The thesis analyzes the concurrent access time of the file access model and the load of the NameNode. Experimental results show that the optimized HDFS has a significant performance advantage in access time for hot files, that the load average of the NameNode decreases to some extent, and that the CPU utilization of the NameNode improves to some extent.
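
The hotspot cache idea in the abstract can be illustrated with a small, self-contained sketch. It is not the thesis's implementation: the sliding-window counting rule, the class name `HotspotMetadataCache`, the `namenode_lookup` callable, and the default threshold and window values are all assumptions made for illustration. A file is treated as hot once it has been accessed at least `threshold` times within the window, and from then on its metadata is served from the cache instead of the NameNode.

```python
from collections import defaultdict, deque


class HotspotMetadataCache:
    """Toy model of a hotspot cache placed in front of a NameNode.

    Hypothetical sketch: the threshold rule and all names are assumptions,
    not the design described in the thesis.
    """

    def __init__(self, namenode_lookup, threshold=3, window_seconds=60.0):
        self._lookup = namenode_lookup        # stand-in for a NameNode query: path -> metadata
        self._threshold = threshold           # accesses per window needed to count as hot
        self._window = window_seconds         # length of the sliding window, in seconds
        self._accesses = defaultdict(deque)   # path -> timestamps of recent accesses
        self._cache = {}                      # path -> cached metadata for hot files
        self.namenode_calls = 0               # how often the NameNode had to be asked

    def get_metadata(self, path, now):
        """Return metadata for `path`, recording an access at time `now`."""
        times = self._accesses[path]
        times.append(now)
        # Forget accesses that have fallen out of the sliding window.
        while now - times[0] > self._window:
            times.popleft()

        is_hot = len(times) >= self._threshold
        if not is_hot:
            # The file is not hot (or has cooled down): drop any cached copy.
            self._cache.pop(path, None)
        elif path in self._cache:
            # Hot file already cached: answer without touching the NameNode.
            return self._cache[path]

        # Cache miss: fall back to the NameNode.
        self.namenode_calls += 1
        metadata = self._lookup(path)
        if is_hot:
            self._cache[path] = metadata
        return metadata


def fake_namenode(path):
    """Hypothetical NameNode stand-in returning made-up block metadata."""
    return {"path": path, "blocks": ["blk_1", "blk_2"]}


cache = HotspotMetadataCache(fake_namenode, threshold=3, window_seconds=60.0)
for t in (0.0, 1.0, 2.0, 3.0):
    cache.get_metadata("/data/hot.log", now=t)
print(cache.namenode_calls)
```

Timestamps are passed in explicitly so the behaviour is deterministic. In this toy model the fourth access is answered from the cache, which is the mechanism by which such a service could take load off the master node.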
Keywords/Search Tags: Cloud Storage, Load Balancing, Hot Data, Replication Strategy