Research On Anonymization Privacy Protection Techniques To Data Publishing

Posted on:2017-01-12

Degree:Master

Type:Thesis

Country:China

Candidate:J Hu

GTID:2348330533450337

Subject:Information and Communication Engineering

Abstract/Summary:

As one of the most important means of resource sharing, data publishing provides a fast and convenient way to publish and share information. If the sensitive attributes in published data are left unprotected, private information can leak and cause immeasurable loss.

In privacy protection for a single sensitive attribute, the l-diversity model based on domain generalization incurs unnecessary information loss, which reduces the usability of the anonymized data. In addition, the sensitive attribute values remain vulnerable to similarity attacks and skewness attacks. In privacy protection for multiple sensitive attributes, the multi-dimensional bucket grouping approach is commonly used, but the l-diversity grouping principle for composite sensitive attributes places overly strict requirements on the distribution of sensitive attribute values, which leads to a high suppression ratio. Moreover, the approach is suitable only when the dimensionality of the sensitive attributes is relatively small; as the dimensionality grows, the additional information loss and the suppression ratio it produces grow as well. In view of these problems, the research work and contributions of this thesis are as follows:

1. To address the problems in protecting a single sensitive attribute, this thesis proposes a clustering-based l-diversity anonymization algorithm. The algorithm uses clustering to generate equivalence classes and applies local generalization to reduce information loss. Because this algorithm cannot effectively prevent similarity attacks and skewness attacks, the thesis improves it and proposes an (l, c)-diversity anonymization algorithm based on sensitivity grouping constraints. Sensitive attribute values are divided into several sensitivity groups according to their sensitivity. By setting constraints on the sensitivity groups and a maximum frequency threshold for sensitive attribute values, the improved algorithm provides better privacy protection.

2. To address the problems in protecting multiple sensitive attributes, this thesis proposes a (p, l)-anonymization privacy protection model based on correlation-based division of multiple sensitive attributes. First, the sensitive attributes are classified according to their correlation, computed with the information gain method, to reduce dimensionality. Second, the sensitive attributes are grouped according to the (p, l)-diversity grouping principle, so that the published data resists skewness attacks and the risk of background knowledge attacks is reduced. Finally, the model is implemented using clustering. The results show that the additional information loss and the suppression ratio are small and that the published data has higher usability.
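The abstract does not give the formal definition of (l, c)-diversity, so the following Python fragment is only a minimal sketch under stated assumptions: an equivalence class is taken to be the set of records that share the same generalized quasi-identifier values, l is read as the minimum number of distinct sensitive values per class (distinct l-diversity), and c is read as the maximum relative frequency allowed for any one sensitive value. The sensitivity-group constraints described in the thesis are not modeled here, and all function names, field names, and sample records are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_quasi_identifier(records, qi_fields):
    """Group records into equivalence classes keyed by their
    (already generalized) quasi-identifier values."""
    classes = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in qi_fields)
        classes[key].append(record)
    return classes


def satisfies_l_and_c(sensitive_values, l, c):
    """Assumed reading of the two constraints (not the thesis's definition):
    at least l distinct sensitive values, and no single value accounting
    for more than a fraction c of the equivalence class."""
    counts = Counter(sensitive_values)
    distinct_ok = len(counts) >= l
    frequency_ok = max(counts.values()) / len(sensitive_values) <= c
    return distinct_ok and frequency_ok


# Hypothetical, already generalized sample data.
records = [
    {"age": "20-30", "zip": "476**", "disease": "flu"},
    {"age": "20-30", "zip": "476**", "disease": "gastritis"},
    {"age": "20-30", "zip": "476**", "disease": "asthma"},
    {"age": "30-40", "zip": "479**", "disease": "cancer"},
    {"age": "30-40", "zip": "479**", "disease": "cancer"},
    {"age": "30-40", "zip": "479**", "disease": "flu"},
]

classes = group_by_quasi_identifier(records, ["age", "zip"])
results = {
    key: satisfies_l_and_c([row["disease"] for row in rows], l=2, c=0.5)
    for key, rows in classes.items()
}
```

In this sample the second equivalence class contains two distinct diseases, so it passes a plain distinct 2-diversity check, yet one value covers two thirds of the class and the frequency threshold rejects it. This is the kind of skewed distribution that a maximum frequency threshold is meant to catch.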

Keywords/Search Tags:

data publishing, sensitive attribute, privacy protection, l-diversity

Related items

1	Research On Anonymization Privacy Protection Techniques To Data Publishing
2	Privacy Preserving Research For Multiple Sensitive Attribute Data Publishing
3	An Anonymous Privacy Protection Method For Multiple Sensitive Attribute Data Publishing
4	Research On Privacy Preserving Data Publishing For Multi-sensitive Attribute Based On Clustering
5	Research On Privacy Protection Methods For Sensitive Attributes In Data Publishing
6	Research And Application Of Personal Privacy Protection In Data Publishing
7	Research On K-anonymous Algorithm Based On The Sensitive Degree Of Privacy
8	Research On Multi-Sensitive Attributes Data Publishing Grouping Method For Privacy Preserving
9	Privacy Preserving Method For Multi Sensitive Attributes Data In Data Publishing
10	Models And Methods For Privacy-Preserving Data Publishing