Research and Design of a Cloud Storage Namespace Architecture

Posted on: 2015-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: N Zhao
GTID: 2298330467485550
Subject: Computer technology
Abstract/Summary:
The development of the Internet has brought the IT field into the era of big data, and the volume of data is growing rapidly. Individuals and enterprises alike have a growing demand for high-quality, low-cost mass storage. The namespace architecture of a cloud storage system directly determines its capacity expansion, storage space, quality of service, management costs and so on. Existing cloud storage products commonly realize the "cloud" concept by building large centralized data centers. As the amount of data grows, however, these products face increased management complexity, restricted expansion, high costs and declining quality of service.
In this paper, after an in-depth study of the namespaces of classic large-scale data storage systems, we found that systems that are easy to manage commonly use a physical or logical master structure but suffer from limited scalability, while systems with good theoretical scalability are impractical because of their high maintenance costs. After weighing the advantages and disadvantages of these schemes, we designed a cloud storage architecture that provides fast file storage and retrieval services with elastic scaling and, in principle, unbounded expansion.
To integrate storage resources on the Internet at lower cost and to allow storage space to expand without a fixed limit, the private file management sub-layer clusters storage units automatically by IPv6 address prefix. A large number of widely distributed storage units are thereby aggregated into separate subdomains, each maintained through consistent hashing. Because files are stored in the nearest subdomain, the system achieves lower network latency, and because the individual subdomains store files in parallel, system throughput improves.
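The prefix clustering and consistent hashing described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the /48 prefix length, the SHA-1 ring with 64 virtual nodes, the use of shared address prefix as a stand-in for network proximity, and all function and class names are assumptions made for illustration only.

```python
import bisect
import hashlib
import ipaddress
from collections import defaultdict


def subdomain_of(addr: str, prefix_len: int = 48) -> str:
    """Return the IPv6 prefix (as text) that an address falls under."""
    return str(ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False))


def _hash(key: str) -> int:
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring over the storage units of one subdomain."""

    def __init__(self, nodes, replicas: int = 64):
        # Each node is placed on the ring `replicas` times (virtual nodes).
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    def locate(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._keys, _hash(key)) % len(self._keys)
        return self._ring[idx][1]


def build_subdomains(node_addrs, prefix_len: int = 48):
    """Group storage units by IPv6 prefix; one hash ring per subdomain."""
    groups = defaultdict(list)
    for addr in node_addrs:
        groups[subdomain_of(addr, prefix_len)].append(addr)
    return {prefix: HashRing(members) for prefix, members in groups.items()}


def nearest_subdomain(client_addr: str, subdomains) -> str:
    """Pick the subdomain sharing the longest address prefix with the client.

    Assumption: prefix similarity approximates network proximity.
    """
    client = int(ipaddress.IPv6Address(client_addr))
    return min(
        subdomains,
        key=lambda p: client ^ int(ipaddress.ip_network(p).network_address),
    )


if __name__ == "__main__":
    nodes = ["2001:db8:1::10", "2001:db8:1::20", "2001:db8:2::10"]
    subs = build_subdomains(nodes)
    home = nearest_subdomain("2001:db8:2::abcd", subs)
    print(home, subs[home].locate("report.pdf"))
```

In this sketch, adding or removing a storage unit only changes the ring of its own subdomain, so files in other subdomains are unaffected.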
To provide an efficient file index for the massive private file management sub-layer, we design a distributed multi-dimensional index, LSH-KD Forest, in the shared file management sub-layer. In this index, files are first grouped into buckets by filename similarity using locality-sensitive hashing (LSH). The files in each bucket are then further partitioned by a multi-dimensional index built over file attributes. With these two levels of partitioning, most irrelevant files are filtered out during retrieval, which sharply narrows the candidate file space and improves retrieval efficiency. The multi-dimensional search tree also allows the shared file management sub-layer to support a variety of retrieval methods over massive high-dimensional data. Algorithm analysis and experimental verification indicate that the proposed distributed namespace architecture has certain advantages and is practical.
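The two-level idea (LSH buckets on filenames, then a multi-dimensional tree on attributes) can be sketched roughly as follows. This is an illustration under stated assumptions, not the LSH-KD Forest from the thesis: it uses MinHash over character trigrams with banding as the LSH scheme, a plain in-memory k-d tree over two hypothetical attributes (size, modification time), and invented names throughout.

```python
import hashlib
from collections import defaultdict


def shingles(name: str, k: int = 3):
    """Character k-grams of a lower-cased filename."""
    name = name.lower()
    return {name[i:i + k] for i in range(max(1, len(name) - k + 1))}


def minhash(name: str, num_hashes: int):
    """MinHash signature: one minimum per salted hash function."""
    sh = shingles(name)
    return [
        min(int(hashlib.md5(f"{i}:{s}".encode()).hexdigest(), 16) for s in sh)
        for i in range(num_hashes)
    ]


def build_kd(points, depth: int = 0):
    """Build a k-d tree from a list of (attribute_tuple, payload) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kd(points[:mid], depth + 1),
        "right": build_kd(points[mid + 1:], depth + 1),
    }


def range_query(node, lo, hi, out):
    """Collect payloads whose attributes lie in the box [lo, hi]."""
    if node is None:
        return out
    attrs, payload = node["point"]
    axis = node["axis"]
    if all(l <= a <= h for l, a, h in zip(lo, attrs, hi)):
        out.append(payload)
    if lo[axis] <= attrs[axis]:  # box may extend into the left half
        range_query(node["left"], lo, hi, out)
    if attrs[axis] <= hi[axis]:  # box may extend into the right half
        range_query(node["right"], lo, hi, out)
    return out


class LshKdIndex:
    """Level 1: LSH buckets on filename. Level 2: one k-d tree per bucket."""

    def __init__(self, files, bands: int = 8, rows: int = 2):
        # files: iterable of (name, (size_bytes, mtime)); attributes are assumed.
        self.bands, self.rows = bands, rows
        buckets = defaultdict(list)
        for name, attrs in files:
            for key in self._bucket_keys(name):
                buckets[key].append((attrs, name))
        self.trees = {key: build_kd(pts) for key, pts in buckets.items()}

    def _bucket_keys(self, name):
        sig = minhash(name, self.bands * self.rows)
        r = self.rows
        return [(b, tuple(sig[b * r:(b + 1) * r])) for b in range(self.bands)]

    def search(self, name, lo, hi):
        """Files with a similar name whose attributes fall inside [lo, hi]."""
        hits = set()
        for key in self._bucket_keys(name):
            hits.update(range_query(self.trees.get(key), lo, hi, []))
        return hits
```

A query only touches the buckets its filename hashes into, and within each bucket the k-d tree prunes subtrees that cannot intersect the attribute range, which is the filtering effect the abstract describes.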
Keywords/Search Tags:Cloud Storage, IPv6, Consistent Hashing, LSH, K-D, High Scalability