Big Data Encryption Algorithm Based On Data Deduplication Technology

Posted on: 2014-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: M M Wang
GTID: 2268330425459202
Subject: Computer application technology
Abstract/Summary:
Big data, typically characterized by volume, variety, velocity, and high value with low density, usually refers to information that cannot be processed or analyzed using traditional processes or tools. In recent years, with the rapid development of cloud computing, the Internet of Things, and social network technologies, the volume of network data has increased dramatically. As a result, traditional encryption technologies and management modes struggle to meet the capacity, performance, storage-efficiency, and security requirements of big data. In big data environments, data security and privacy protection for massive numbers of users face unprecedented challenges, which have drawn wide concern from academia and industry.

To address the slow encryption speed, weak real-time performance, and low efficiency of present big data encryption schemes, this thesis explores a new encryption algorithm for big data based on data deduplication technology, building on a detailed analysis of the four characteristics and the basic encryption model of big data. The main contributions of this thesis are as follows:

Firstly, the thesis gives an in-depth analysis of the basic encryption model and encryption principles of big data. It discusses the encryption principles, advantages, and disadvantages of four mainstream big data encryption technologies: encryption based on modern cryptosystems, encryption based on biological engineering, attribute-based encryption, and encryption based on parallel and distributed computing.

Secondly, it proposes a data deduplication algorithm for big data based on a Bloom filter. After analyzing the advantages and disadvantages of the existing encryption schemes, and based on the four characteristics of big data (volume, variety, velocity, and high value with low density), the thesis studies data deduplication technology for big data. It evaluates four methods used to discover identical portions of data: the Whole File Detection scheme (WFD), the Fixed Size Partition scheme (FSP), the Content-defined Chunking scheme (CDC), and the Sliding Window Chunking scheme (SWC). By comparing the schemes in matching accuracy, time consumption, and space consumption, a new data deduplication algorithm suited to big data is proposed. In addition to combining the high speed of the Whole File Detection scheme with the low additional storage overhead of the Content-defined Chunking scheme, it uses a Bloom filter to complete the dimensionality reduction of big data. A Bloom filter is a space-efficient probabilistic data structure that tests whether an element is a member of a set and is commonly used for similar-data detection and dimensionality reduction. The experimental results show that the new algorithm combines the advantages of these three technologies: its detection speed is satisfactory while it maintains high detection performance.

Thirdly, it proposes a big data encryption algorithm based on data deduplication technology. After the big data has been deduplicated, the algorithm combines the high security and speed of Elliptic Curve Cryptography with the characteristics (calculation speed, parallelism, and degree of security) of the five block cipher modes of operation of the Advanced Encryption Standard (AES). The new algorithm is evaluated according to an evaluation system for encryption algorithms, and a comparative analysis of these techniques is presented. Finally, the experimental results show that the security of the new algorithm is satisfactory and that it can effectively improve the speed of big data encryption and decryption.
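The Bloom-filter-based deduplication idea summarized above can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the thesis, whose details the abstract does not give. For simplicity it assumes fixed-size partitioning (as in the FSP scheme), SHA-256 chunk fingerprints, and an in-memory dictionary standing in for the chunk index. The names `BloomFilter`, `deduplicate`, and `restore`, as well as the filter size, hash count, and chunk size, are hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Bit array probed by several hash positions (sizes are assumed values)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive the probe positions by double hashing over one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # False means "definitely not added"; True means "possibly added".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def deduplicate(data, chunk_size=8):
    """Split data into fixed-size chunks and keep each distinct chunk once."""
    seen = BloomFilter()
    store = {}   # fingerprint -> chunk bytes (stand-in for a chunk index)
    recipe = []  # ordered fingerprints needed to rebuild the input
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fp = hashlib.sha256(chunk).digest()
        # A negative Bloom answer skips the exact lookup entirely; a positive
        # answer may be a false positive, so it is confirmed against the store.
        if fp not in seen or fp not in store:
            store[fp] = chunk
            seen.add(fp)
        recipe.append(fp)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original bytes from the unique chunks and the recipe."""
    return b"".join(store[fp] for fp in recipe)


data = b"ABCDEFGH" * 3 + b"12345678"
store, recipe = deduplicate(data)
print(len(recipe), "chunks,", len(store), "unique")
```

In this sketch the Bloom filter only acts as a cheap pre-check in front of the exact index: a chunk the filter has never seen cannot be a duplicate, so the more expensive lookup is needed only for "possibly seen" chunks.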
Keywords/Search Tags: Big data security, Data deduplication, Bloom Filter, Advanced Encryption Standard (AES), Counter (CTR), Elliptic Curve Cryptography (ECC)