
The Research And Implementation Of Full-Domain Anonymization Algorithm Based On Cloud Platform

Posted on: 2016-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Hu
GTID: 2298330467976493
Subject: Communication and Information System
Abstract/Summary:
Data owners often need to release data for knowledge discovery, decision support, information sharing, and scientific research. Even when personal identity information has been removed before release, an attacker can still link the released data with data from other sources to identify individuals. To reduce the risk of privacy disclosure, data owners need to apply a data anonymization procedure.

To determine whether data meet the requirements of privacy protection, scholars have put forward the concept of a privacy protection model. Data are considered safe to release only when they satisfy the basic requirements of such a model. The k-anonymity model is currently widely accepted and studied. It mainly uses generalization to achieve data anonymization. The full-domain anonymization scheme is one of the typical anonymization schemes based on generalization, and the privacy protection algorithm studied in this paper enforces it.

With the rapid development of information technology, data volumes continue to grow and data are shared more frequently, which greatly increases the likelihood of privacy disclosure. Data privacy protection is therefore particularly urgent. However, prior privacy protection algorithms that enforce full-domain anonymization all work in stand-alone mode, and their efficiency is very low. Stand-alone algorithms are clearly not applicable to big data, and this challenge has not yet been well addressed.

To address this challenge, this paper employs the MapReduce distributed programming model, a cloud computing technology, to handle the anonymization of big data. Cloud computing makes better use of computing resources to handle explosively growing data and, in particular, simplifies the distributed parallel anonymization of big data. This paper therefore proposes a MapReduce-based algorithm, MRFDA, that enforces the full-domain anonymization scheme and takes full advantage of cloud computing in anonymizing big data.
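
To make the ideas of full-domain generalization and k-anonymity concrete, the following minimal Python sketch may help. It is not the MRFDA algorithm, which the abstract does not describe in detail. It only illustrates the general idea under stated assumptions: the quasi-identifiers (age and ZIP code), their generalization hierarchies, the function names, and the sample records are all hypothetical, and the map and reduce steps are simulated in a single process rather than run on a real MapReduce cluster.

```python
from collections import Counter
from functools import reduce
from itertools import product

# Hypothetical generalization hierarchies (level 0 = raw value).
def generalize_age(age, level):
    if level == 0:
        return str(age)
    if level == 1:
        low = age // 10 * 10
        return f"{low}-{low + 9}"   # ten-year band
    return "*"                      # fully suppressed

def generalize_zip(zipcode, level):
    # Mask the last `level` digits of the ZIP code.
    level = min(level, len(zipcode))
    return zipcode[:len(zipcode) - level] + "*" * level

def map_partition(records, levels):
    """Map step (simulated): generalize every record in one data partition
    with the SAME levels (the full-domain property) and count locally."""
    age_level, zip_level = levels
    return Counter(
        (generalize_age(age, age_level), generalize_zip(zipcode, zip_level))
        for age, zipcode in records
    )

def reduce_counts(left, right):
    """Reduce step (simulated): merge per-partition group counts."""
    return left + right

def is_k_anonymous(partitions, levels, k):
    """True if every generalized group contains at least k records."""
    total = reduce(reduce_counts,
                   (map_partition(p, levels) for p in partitions),
                   Counter())
    return all(count >= k for count in total.values())

def find_minimal_levels(partitions, k, max_levels=(2, 5)):
    """Try level vectors in order of increasing total height and return
    the first one that yields k-anonymity, or None if none does."""
    candidates = sorted(
        product(range(max_levels[0] + 1), range(max_levels[1] + 1)),
        key=sum,
    )
    for levels in candidates:
        if is_k_anonymous(partitions, levels, k):
            return levels
    return None

# Hypothetical records (age, ZIP code), split into two partitions.
partitions = [
    [(34, "10001"), (36, "10002")],
    [(31, "10003"), (47, "10451"), (42, "10452")],
]
print(find_minimal_levels(partitions, k=2))
```

Because each partition is counted independently and the counts are merged afterwards, the expensive pass over the data can in principle be distributed, which is the property a MapReduce-based approach relies on. The exhaustive search over level vectors shown here is only for illustration and would not scale to many attributes.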
Keywords/Search Tags: privacy protection, data anonymization, big data, cloud computing, MapReduce