Research On Data Redundance Strategies In Cloud Storage System

Posted on: 2017-06-15  Degree: Master  Type: Thesis
Country: China  Candidate: H S Shi  Full Text: PDF
GTID: 2428330590968200  Subject: Computer technology
Abstract/Summary:
Cloud computing is a form of distributed computing based on virtualization, load balancing, network-attached storage, and software-defined networking. It divides a large data set into small chunks and dispatches them to different nodes for the actual computation, which lets it handle big data processing and manage software and hardware resources elastically. Hadoop has become the de facto cloud computing framework. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly available service on top of a cluster of computers, each of which may be prone to failures. However, the static storage management in Hadoop degrades performance and wastes storage under data skew. In addition, HDFS performs poorly in the Map shuffle phase. In this dissertation, we analyse the data flow of Hadoop MapReduce jobs and then propose a dynamic heterogeneous storage system, DHS, which dynamically adjusts data storage and divides I/O accesses among different storage media. DHS can add or delete Block replicas as needed while using heterogeneous storage to deliver high bandwidth. Our contributions are summarized as follows. First, unlike existing storage improvement schemes, DHS introduces access and load prediction through statistical analysis of historical access records. Second, DHS uses a new feedback mechanism for storage regulation that adapts the storage layout to access demands. Finally, DHS considers in detail how access streams reach ordinary disks and develops a hybrid storage mechanism that eases contention, improving the overall performance of the disk array. Both simulation and deployment demonstrate the integrity and efficiency of DHS.
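The abstract does not give the prediction model or the replica policy, so the following Python sketch is only an illustration of the general idea of predicting load from historical access records and adjusting Block replicas as needed. The exponentially weighted moving average, the function names, and every parameter value (`alpha`, `per_replica_capacity`, `min_replicas`, `max_replicas`) are assumptions made for the example, not details taken from DHS.

```python
import math


def predict_load(history, alpha=0.5):
    """Estimate the next interval's access count for one block.

    Uses an exponentially weighted moving average over past per-interval
    access counts. This predictor is an assumption; the abstract only says
    that prediction is based on statistical analysis of historical records.
    """
    if not history:
        return 0.0
    estimate = float(history[0])
    for count in history[1:]:
        # Recent intervals weigh more than older ones.
        estimate = alpha * count + (1 - alpha) * estimate
    return estimate


def target_replicas(predicted_load, per_replica_capacity=100.0,
                    min_replicas=3, max_replicas=10):
    """Map a predicted load to a replica count within fixed bounds.

    All three parameters are hypothetical tuning knobs.
    """
    needed = math.ceil(predicted_load / per_replica_capacity)
    return max(min_replicas, min(max_replicas, needed))


def plan_adjustment(current_replicas, history):
    """Return how many replicas to add (positive) or delete (negative)."""
    return target_replicas(predict_load(history)) - current_replicas


if __name__ == "__main__":
    # A block whose access count jumps suddenly gains a replica;
    # a block with no recorded accesses drops back to the minimum.
    print(plan_adjustment(3, [0, 800]))
    print(plan_adjustment(5, []))
```

In a feedback loop of the kind the abstract describes, a function such as `plan_adjustment` would be re-evaluated each interval with the newest access counts, so the replica count follows the observed skew instead of staying fixed.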
Keywords/Search Tags:Cloud Computing, Data Storage, Reliability, Data Skew, Distributed System