
Research On Anonymity Models And Algorithms Of Privacy Preserving For Microdata Publishing To Thwarting Similarity Attack

Posted on: 2015-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Wang
Full Text: PDF
GTID: 2298330431493438
Subject: Computer software and theory
Abstract/Summary:
Large volumes of data about individuals, known as microdata, now exist in the information field. Microdata such as medical patient data, demographic data, and business data play an increasingly important role in trend analysis, disease research, and market analysis, so many organizations collect and publish them. However, microdata contain private information, and publishing or sharing them threatens individuals' privacy. How to protect individuals' privacy effectively when publishing microdata has become a hot topic in the database and information security fields.

A large body of work has addressed the anonymization of microdata, but existing anonymity models rarely consider the semantic similarity between sensitive values, so they cannot resist semantic similarity attacks. Research on anonymity models and algorithms that enable microdata to resist semantic similarity attacks is therefore of great significance. This paper concentrates on anonymity models and corresponding algorithms for microdata with similar sensitive attributes. The main contributions are as follows:

(1) A (k, ε)-anonymity model for thwarting similarity attacks on categorical sensitive attributes is proposed. Existing sensitive-attribute diversity models do not consider the semantic similarity between sensitive values, so they cannot thwart similarity attacks. To solve this problem, the paper proposes a (k, ε)-anonymity model, in a strong and a weak variant, which requires that each equivalence class in the anonymized data set satisfy the k-anonymity constraint and, in addition, that no two sensitive values in the same equivalence class be ε-similar. The paper also proposes a (k, ε)-KACA algorithm to implement the (k, ε)-anonymity model. Experimental results show that anonymized data satisfying (k, ε)-anonymity are better able to thwart similarity attacks, because the model strengthens the diversity constraints on sensitive values. The (k, ε)-anonymity model therefore protects privacy more effectively than the l-diversity model.

(2) An (l, e)-diversity model to resist semantic similarity attacks is proposed. Existing sensitive-attribute diversity models do not capture the semantic similarity between sensitive values, so they cannot resist semantic similarity attacks. To address this problem, the paper proposes an (l, e)-diversity model that imposes two constraints on each equivalence class: (i) it contains at least l well-represented values; (ii) no two of its sensitive values are e-similar. Furthermore, the paper designs a linear-complexity maximum-bucketization greedy algorithm to implement the model. Experimental results show that anonymized data satisfying (l, e)-diversity have a higher degree of diversity than data satisfying l-diversity, so (l, e)-diversity protects privacy more effectively than l-diversity.

(3) An (l, e, m)-diversity model to resist semantic similarity attacks on multiple sensitive attributes is proposed. Most existing research addresses microdata with only one sensitive attribute, yet much real-world microdata contains multiple sensitive attributes. In general, models for multiple sensitive attributes cannot thwart similarity attacks either. To solve this problem, the paper proposes an (l, e, m)-diversity model based on contribution (2), where m is the number of sensitive attributes (dimensions). It requires that each equivalence class satisfy the (l, e)-diversity constraint in every dimension. The paper also proposes an MSBF algorithm to implement the (l, e, m)-diversity model. Experimental results show that the MSBF algorithm achieves higher diversity than MBF, MSCF, and MMDCF, so the (l, e, m)-diversity model protects privacy more effectively and addresses the problem of multiple sensitive attributes.
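The three constraints described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's (k, ε)-KACA, bucketization, or MSBF algorithm; it only checks whether an already anonymized table meets each constraint. Several points are assumptions, since the abstract does not define them: the toy `similarity` score and the `CATEGORY` table are hypothetical, "well-represented" is read in its simplest form as "distinct", every pair of records in a class is compared (so duplicates count as similar whenever the similarity function says so), and the strong and weak variants are not distinguished.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical semantic categories for a few illnesses (illustration only).
CATEGORY = {
    "gastritis": "stomach",
    "gastric ulcer": "stomach",
    "flu": "respiratory",
    "bronchitis": "respiratory",
}


def similarity(a, b):
    """Toy score (an assumption): 1.0 identical, 0.5 same category, else 0.0."""
    if a == b:
        return 1.0
    return 0.5 if CATEGORY[a] == CATEGORY[b] else 0.0


def make_similar(threshold):
    """Build a predicate: two values are similar if their score >= threshold."""
    return lambda a, b: similarity(a, b) >= threshold


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[q] for q in quasi_identifiers)].append(record)
    return list(groups.values())


def pairwise_dissimilar(values, similar):
    """True if no pair of values (one per record) is judged similar."""
    return not any(similar(a, b) for a, b in combinations(values, 2))


def satisfies_k_eps(records, quasi_identifiers, sensitive, k, similar):
    """Each class has at least k records and no similar pair of sensitive values."""
    for group in equivalence_classes(records, quasi_identifiers):
        values = [r[sensitive] for r in group]
        if len(group) < k or not pairwise_dissimilar(values, similar):
            return False
    return True


def satisfies_l_e(records, quasi_identifiers, sensitive, min_distinct, similar):
    """Each class has at least min_distinct distinct values and no similar pair."""
    for group in equivalence_classes(records, quasi_identifiers):
        values = [r[sensitive] for r in group]
        if len(set(values)) < min_distinct:
            return False
        if not pairwise_dissimilar(values, similar):
            return False
    return True


def satisfies_l_e_m(records, quasi_identifiers, similar_by_attr, min_distinct):
    """Apply the (l, e) check to every sensitive attribute (dimension)."""
    return all(
        satisfies_l_e(records, quasi_identifiers, attr, min_distinct, similar)
        for attr, similar in similar_by_attr.items()
    )


# Hypothetical anonymized table with generalized quasi-identifiers.
RECORDS = [
    {"zip": "476**", "age": "2*", "disease": "gastritis"},
    {"zip": "476**", "age": "2*", "disease": "gastric ulcer"},
    {"zip": "479**", "age": "3*", "disease": "flu"},
    {"zip": "479**", "age": "3*", "disease": "gastritis"},
]
QI = ["zip", "age"]

print(satisfies_k_eps(RECORDS, QI, "disease", 2, make_similar(0.5)))
print(satisfies_k_eps(RECORDS, QI, "disease", 2, make_similar(0.8)))
```

With a threshold of 0.5 the first class fails, because gastritis and gastric ulcer fall in the same hypothetical category even though the class is 2-anonymous and holds two distinct values. This is the situation the abstract describes, where plain diversity constraints do not stop a similarity attack. Raising the threshold to 0.8 makes the same table pass under this toy score.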
Keywords/Search Tags: privacy preserving, (k, ε)-anonymity, (l, e)-diversity, (l, e, m)-diversity, multiple sensitive attributes