
Research On Data Distribution Strategies For Cloud Storage Based On Data Redundancy

Posted on: 2016-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: S D Yuan
GTID: 2348330479953369
Subject: Computer system architecture
Abstract/Summary:
With the advent of the big data era, cloud storage plays a key role in the storage, processing, and mining of massive data. To ensure data availability, traditional cloud storage systems add redundancy with either complete replicas or erasure codes, and each method has its own disadvantages in different systems. In view of this situation, we design a new cloud storage architecture and data distribution strategy that combines the two redundancy methods.

Through an in-depth analysis of the differences between complete replication and erasure coding, we design a two-level cloud storage architecture consisting of a data center and super nodes. To exploit the space-saving property of erasure codes, the data center uses erasure coding to add redundancy; to exploit the high access performance of replicas, the super nodes use replication to provide efficient data access. In the data center, we propose a data distribution algorithm based on sorted nodes; it locates data pieces on their nodes quickly, speeds up the super nodes' access to data, and also benefits load balancing within the data center. In the super nodes, the number of replicas of a file changes as the file's heat changes. We propose a multi-cycle heat prediction algorithm based on history information, and then discuss when the heat should be updated, how many replicas to keep, and how garbage collection is performed in the super node.

Experiments show that the node-ordering algorithm proposed in this paper runs 70% faster than traditional random or sequential data access algorithms, and that the distribution deviation of data blocks across nodes is less than 6%, which makes load balancing possible. Besides, compared with the LRU and LFU algorithms in OptorSim, the dynamic replica management algorithm not only improves performance but also reduces the number of replicas by 50%, thereby saving storage space.
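The abstract does not give the details of the sorted-node distribution algorithm, but a placement scheme that keeps node identifiers sorted and locates a block by binary search (in the spirit of consistent hashing) could be sketched as follows. The class and function names here are illustrative assumptions, not the thesis's actual implementation:

```python
import bisect
import hashlib

def ring_key(identifier: str) -> int:
    """Map a node or block identifier to an integer point on a hash ring."""
    return int(hashlib.md5(identifier.encode()).hexdigest(), 16)

class SortedNodeTable:
    """Keep node keys sorted so the node holding a block is found by binary search."""

    def __init__(self, node_ids):
        # Sort nodes once by their ring position.
        self.nodes = sorted(node_ids, key=ring_key)
        self.keys = [ring_key(n) for n in self.nodes]

    def locate(self, block_id: str) -> str:
        """Return the node responsible for block_id:
        the first node whose key is >= the block's hash, wrapping around."""
        h = ring_key(block_id)
        i = bisect.bisect_left(self.keys, h)
        return self.nodes[i % len(self.nodes)]
```

Because lookup is a binary search over a sorted list, locating a block is O(log n) rather than a scan, and the hash spreads blocks roughly evenly across nodes, which is consistent with the sub-6% distribution deviation reported above.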
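The multi-cycle heat prediction is likewise only described at a high level. One plausible reading, under the assumption that "heat" is an access count per cycle and recent cycles matter more, is a weighted average over the history window, with the predicted heat then mapped to a replica count. The weighting scheme and thresholds below are illustrative, not taken from the thesis:

```python
def predict_heat(history, weights=None):
    """Predict next-cycle heat from per-cycle access counts (oldest first),
    weighting recent cycles more heavily (assumed exponential weights)."""
    if weights is None:
        weights = [2 ** i for i in range(len(history))]  # most recent cycle weighs most
    total = sum(weights)
    return sum(h * w for h, w in zip(history, weights)) / total

def replica_count(predicted_heat, base=1, per_unit=100, cap=8):
    """Map predicted heat to a replica count: one extra replica per
    `per_unit` of heat, never below `base` or above `cap`."""
    return min(cap, base + int(predicted_heat // per_unit))
```

A super node could run `predict_heat` at the end of each cycle, add replicas when `replica_count` rises, and let garbage collection reclaim replicas of files whose predicted heat has fallen.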
Keywords/Search Tags:Cloud storage, Data redundancy, Data heat, Data copy, Erasure code