
Research On Key Technologies Of Resources Management In Cloud Storage System

Posted on: 2015-07-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P D Ye
Full Text: PDF
GTID: 1228330467963685
Subject: Computer Science and Technology
Abstract/Summary:
Cloud storage uses technologies such as cluster applications, grid computing, and distributed file systems to make different kinds of storage equipment work together through software and networks, providing data storage and access services to users. Its core is data storage and management. The rise of cloud storage services means that storage space and network usage keep growing, and data has turned from a simple object of processing into a basic resource, which increases the security risk to those resources. As a result, more and faster backups and point-in-time recovery points are needed, which makes data management very costly. Secure storage, network transfer, and efficient utilization have become key problems in recent research on cloud storage systems.

The goal of this dissertation is to improve the efficiency of data resource management. It combines deduplication-based data compaction, a bandwidth utilization improvement policy based on an autoregressive model of network traffic, and sensitive data protection based on discrete dynamics to propose a set of data resource management models and architectures for cloud storage systems, which have both theoretical significance and application value. The main innovations and contributions are summarized as follows:

(1) To address the disk bottleneck in chunk fingerprint lookup, an optimized data deduplication algorithm called Simdedup is proposed. It starts from the observation that similar data objects contain more redundancy and builds a hierarchical index for chunk fingerprints. It then performs chunk fingerprint lookup based on similarity detection.
Experimental results show that Simdedup can accurately find similar data objects and build the deduplication index from them, which improves deduplication performance.

(2) To address the redundant metadata problem in data deduplication, a deduplication algorithm based on the condensed nearest neighbor rule, called Dedup2, is proposed. It eliminates highly similar metadata to obtain a much smaller subset based on the consistent subset. Experimental results show that Dedup2 can reduce the size of deduplication metadata by more than 50% while maintaining a similar deduplication ratio, further reducing the cost of storage resources.

(3) To address the low utilization of network bandwidth caused by redundant data transfer, a network data deduplication algorithm based on an autoregressive model of network traffic, called ARTRE, is proposed. It splits data objects into several transfer units and builds a prediction model of current network conditions. By predicting the available network bandwidth, it adjusts its transfer policy in a self-organizing manner to improve bandwidth utilization. Experimental results show that its transfer throughput is 7 times that of a traditional deduplication-based transfer scheme. It makes fuller use of network bandwidth to achieve higher transfer efficiency and adapts well to changing network conditions.

(4) To address data security in the cloud storage environment, a new privacy-preserving algorithm based on the principles of discrete dynamical systems, called EPPA, is proposed. It encrypts data with a chaotic map and scrambles and splits the data in three-dimensional space, which prevents users' sensitive information from being leaked to attackers or to the cloud system administrator.
By mapping, scrambling, and splitting data in three-dimensional space, EPPA leverages the sensitivity of the chaotic map to initial values and the complexity of recovering the three-dimensional data structure to ensure data confidentiality. Analyses and experiments show that this approach can ensure the confidentiality of user data in cloud storage systems, and that it encrypts 85% faster than AES.
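
As a rough illustration of the similarity-guided chunk fingerprint lookup described in contribution (1), the following Python sketch may help. It is not the Simdedup algorithm itself, whose details the abstract does not give. It assumes fixed-size chunks, SHA-1 fingerprints, and the minimum chunk fingerprint as an object's similarity signature; the names `CHUNK_SIZE`, `chunk_fingerprints`, and `SimilarityIndex` are hypothetical and chosen only for this example.

```python
import hashlib
from collections import defaultdict

CHUNK_SIZE = 8  # bytes; deliberately tiny so the example is easy to follow (assumption)


def chunk_fingerprints(data: bytes, size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks and return one SHA-1 fingerprint per chunk."""
    return [hashlib.sha1(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


class SimilarityIndex:
    """Two-level index: representative fingerprint -> set of chunk fingerprints.

    The representative (here, the minimum chunk fingerprint) is an assumed
    stand-in for a similarity signature. Objects that share it are treated
    as similar, and chunk lookups are confined to their shared bin.
    """

    def __init__(self) -> None:
        self.bins: dict[str, set[str]] = defaultdict(set)

    def store(self, data: bytes) -> tuple[int, int]:
        """Deduplicate one object; return (new_chunks, duplicate_chunks)."""
        fps = chunk_fingerprints(data)
        if not fps:
            return 0, 0
        representative = min(fps)            # top level: pick the bin
        known = self.bins[representative]    # second level: fingerprints in that bin
        new = dup = 0
        for fp in fps:
            if fp in known:
                dup += 1                     # chunk already stored, keep a reference only
            else:
                known.add(fp)
                new += 1                     # chunk seen for the first time in this bin
        return new, dup
```

Storing the same object twice through one `SimilarityIndex` reports every chunk as a duplicate on the second call. The trade-off in this assumed design is that a near-duplicate object is compared only against the bin its signature selects, so lookups touch a small part of the index at the cost of possibly missing duplicates held in other bins.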
Keywords/Search Tags: cloud storage, data deduplication, similarity detection, deduplication index, data privacy-preserving