
The Research On Big Data’s Distributed Storage And Secure Protection

Posted on: 2015-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: S N Zhao
Full Text: PDF
GTID: 2268330431956894
Subject: Electronic and communication engineering
Abstract/Summary:
With the advent of cloud computing, big data, which is expanding rapidly and without apparent limit, has attracted more and more attention. As awareness of the importance of data grows, how to store ever larger volumes of big data safely and efficiently has become one of the most important problems to address. Distributed data storage is an effective way to pool scattered, idle resources. Because it also makes the stored data less secure, ensuring data security on a distributed computing platform has become an important research topic. Research on distributed data and information storage also applies to the rapidly growing field of cloud storage, so it has considerable theoretical and practical value.

In view of this, and building on a study of security policies for distributed data and information storage, this thesis designs a data access and query mechanism for big data on distributed platforms that also provides an integrity check for data sharing.

First, by combining the architecture and principles of the open-source Hadoop Distributed File System (HDFS) with symmetric encryption algorithms and public-key cryptography, a secure distributed storage model based on HDFS is proposed. Using HDFS as the storage environment and XML as the physical structure of the data file, the model not only addresses the big data storage problem but also provides data access control (DAC) for the data file. The experiments analyze encryption time and bandwidth performance when uploading and downloading data in the distributed file system.

Second, ensuring data security requires changing keys periodically as part of effective key management. Based on Chebyshev polynomials and LKH key-tree management, a periodic multicast key update method is proposed. The method suits key updates on distributed big data platforms and is more efficient and more secure than the original LKH scheme.

Third, a method combining the Bloom Filter algorithm with the MapReduce programming model is proposed to improve ciphertext query performance. Building on the original Bloom Filter algorithm, a hierarchical Bloom Filter is proposed. Experiments show that the hierarchical Bloom Filter reduces the data error rate to some extent.

Fourth, a distributed data file storage and query system is designed and implemented. It supports encrypted upload, decrypted download, distributed sharing, key management, and ciphertext queries over the original data. The system uses a browser/server (B/S) architecture and offers a user-friendly interface that is easy to operate.
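For readers unfamiliar with the original Bloom Filter algorithm that the abstract builds on, the following is a minimal sketch of a plain (non-hierarchical) Bloom filter. It is not the thesis's hierarchical variant, whose details the abstract does not give. The class name, the bit-array size `m`, the hash count `k`, and the use of salted SHA-256 to derive positions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: k hash positions per item over an m-slot array.

    Illustrative only; m, k, and the hashing scheme are assumed values,
    not parameters taken from the thesis.
    """

    def __init__(self, m: int = 1024, k: int = 3):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per slot, for readability

    def _positions(self, item: str):
        # Derive k positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical keyword index over three query terms.
index = BloomFilter()
for keyword in ["hdfs", "bloom", "mapreduce"]:
    index.add(keyword)
```

A filter of this kind never reports an inserted item as absent, but it can report an absent item as present. That false-positive behavior is presumably the "data error rate" the hierarchical design aims to reduce.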
Keywords/Search Tags: distributed storage, cryptograph query, HDFS, Bloom Filter, MapReduce