
Research On Privacy-preserving Data Publishing Algorithms Based On Different Anonymity Requests

Posted on: 2019-12-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Y Li
Full Text: PDF
GTID: 1368330548456760
Subject: Computer application technology
Abstract/Summary:
With the development of big data and machine learning technologies, demand for data is growing across all industries, and data exchange and sharing between industries is becoming an increasingly important part of information exchange. However, shared data contains a large amount of users' private information. If such data is published or shared without any privacy protection, users' private information is easily disclosed. Researchers have therefore proposed privacy-preserving data publishing techniques to address privacy disclosure during data publishing and sharing. This dissertation focuses on algorithms that, for given anonymity requests in a specific publishing scenario, protect private information while preserving as much information utility in the data as possible. The specific contributions are as follows:

1. We propose a cross-bucket generalization algorithm. Cross-bucket generalization combines generalization and bucketization to protect user identity and the sensitive attribute separately, which solves the problem of identity overprotection that arises when generalization is used alone. Because cross-bucket generalization protects user identity and the sensitive attribute independently, we introduce the (k,l)-anonymity principle and make cross-bucket generalization comply with it. This bounds the disclosure probabilities of identity and of sensitive values in the anonymized data by 1/k and 1/l, respectively, and the parameters k and l can be adjusted according to the actual anonymity requests. In addition, we use a heuristic to minimize, as far as possible, the size of every equivalence group and bucket and the range of the generalized quasi-identifier (QI) values in each equivalence group, which further improves the information utility of the anonymized data.

2. We define the publishing scenario of personalized privacy protection and propose a local anatomy algorithm. In this scenario, users can freely set the sensitivity of their own attribute values in the data table, and the attributes in the table are classified as QI attributes, semi-sensitive attributes, or sensitive attributes according to the kinds of values they contain. Based on the rationale of bucketization, local anatomy divides the tuples that carry sensitive values in each semi-sensitive and sensitive attribute into buckets, which keeps all sensitive values safe while preserving all original QI values. Local anatomy not only preserves excellent information utility but is also highly extensible: it can satisfy different anonymity principles at the same time, protecting the sensitive values in the data table according to the characteristics of the attributes or the actual anonymity requests.

3. We present a local anatomy generalization algorithm. It adds a generalization mechanism on top of the local anatomy algorithm so that both user identity and sensitive values are protected in the personalized privacy protection scenario. To protect user identity, local anatomy generalization divides all tuples into subsets according to their QI values and then divides the tuples in each subset into equivalence groups, so that the whole data table satisfies the k-anonymity principle. We implement two local anatomy generalization algorithms, one using multi-dimensional partitioning and one using an NCP-guided heuristic as the generalization mechanism. Because the protections for user identity and for sensitive values are separate in local anatomy generalization, using a different generalization mechanism does not weaken the protection of sensitive values.

In conclusion, this dissertation studies how to protect private information under specific anonymity requests and publishing scenarios. This includes protecting user identity and the sensitive attribute separately, and protecting both when users are allowed to customize which of their values are sensitive. Compared with previous anonymity algorithms, the proposed algorithms reduce information loss during anonymization as far as possible and thus preserve better information utility.
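
As a rough illustration of the (k,l)-anonymity condition described in the abstract, the sketch below checks whether a published table keeps the identity disclosure probability within 1/k and the sensitive-value disclosure probability within 1/l. It is not the dissertation's algorithm. The function names, the toy data, and the frequency-based reading of the 1/l bound (the most common sensitive value in a bucket may account for at most 1/l of that bucket) are all assumptions made for this example.

```python
from collections import Counter
from fractions import Fraction


def identity_risk(equivalence_groups):
    """Worst-case chance of linking a record to a person: 1 / smallest group size."""
    return max(Fraction(1, len(group)) for group in equivalence_groups)


def sensitive_risk(buckets):
    """Largest share that a single sensitive value takes up in any bucket."""
    worst = Fraction(0)
    for values in buckets:
        top_count = Counter(values).most_common(1)[0][1]
        worst = max(worst, Fraction(top_count, len(values)))
    return worst


def satisfies_kl(equivalence_groups, buckets, k, l):
    """Assumed check: identity risk <= 1/k and sensitive-value risk <= 1/l."""
    return (identity_risk(equivalence_groups) <= Fraction(1, k)
            and sensitive_risk(buckets) <= Fraction(1, l))


# Hypothetical toy data: record ids grouped by generalized QI values,
# and sensitive values grouped into buckets.
groups = [[0, 1], [2, 3], [4, 5]]
buckets = [["flu", "asthma", "diabetes"], ["flu", "cancer", "asthma"]]

print(satisfies_kl(groups, buckets, k=2, l=3))
print(satisfies_kl(groups, buckets, k=3, l=3))
```

Because the two risks are computed from separate structures (equivalence groups for identity, buckets for sensitive values), k and l can be tightened independently, which mirrors the separate protections the abstract describes.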
Keywords/Search Tags:Privacy Protection, Anonymity Principle, k-anonymity, l-diversity, Anonymity Algorithm, Generalization Algorithm, Bucketization Algorithm