The Key Technologies Of Privacy Protection In Transactional Data Publishing

Posted on: 2021-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: H Chen
GTID: 2428330629988921
Subject: Engineering
Abstract/Summary:
The large volume of data generated by various apps has driven the rapid development of the big data era. Extracting the hidden value of this data requires mining it after publication, which increases the risk of disclosing personal privacy. It is therefore necessary to protect individual privacy before data are released. Transactional data are one such data type, and their sparse and multidimensional characteristics make privacy protection difficult. This thesis therefore studies the key technologies of privacy protection for transactional data publishing, explores the anonymity model and the differential privacy model, and solves the query inconsistency problem in differential privacy algorithms. Publishing transactional data with privacy protection technology should preserve both the utility of the data and the privacy of individual records. The main work of this thesis is as follows:

(1) Building on existing research on anonymity-based privacy protection for transactional data publishing, a privacy protection algorithm named (c,k)-anonymity is proposed that clearly defines sensitive and non-sensitive items in a transactional data table. First, each item of the transaction records is generalized, an item record generalization table is generated, and a count tree is constructed. Then the threshold c is set according to the privacy requirements. Using the count tree, sensitive and non-sensitive items are clearly distinguished so that the non-sensitive items satisfy k-anonymity; at the same time, the number of records with the same non-sensitive items in an equivalence class is kept at no fewer than k, which protects the sensitive items of individual records. The algorithm effectively prevents the leakage of sensitive items in published data. Finally, the utility of the (c,k)-anonymity algorithm is demonstrated on a transactional dataset.

(2) Building on existing research on privacy protection for transactional data publishing under the differential privacy model, a differentially private transactional data publishing algorithm, DPTDP, based on the k-ary range tree is proposed to improve the accuracy of large-range queries. First, the transaction data table is divided into several sets, and each set is represented by its count under that partition. According to the range sizes, all count values are mapped to a k-ary range tree so that larger ranges receive less privacy budget and smaller ranges receive more. Laplace noise is then added to each node value, and the histogram to be published is exported from the noisy tree. Experiments on transactional datasets show that the algorithm improves query accuracy. DPTDP thus provides differential privacy protection for transactional data publishing.

(3) To address the inconsistency of range queries in differential-privacy-based algorithms for transactional data publishing, a consistency adjustment algorithm, CA, is proposed. First, a set of range queries with intersecting subsets is mapped to a full k-ary range tree in which the ranges of nodes on the same layer do not intersect. Random noise drawn from the Laplace distribution is then added to each node to obtain a differentially private full k-ary range tree. Adjusting the consistency of this tree yields a full k-ary range tree that satisfies the consistency constraint, and traversing the adjusted tree yields data that answer differentially private range queries consistently. The experimental results show that the CA algorithm enforces the query consistency constraint for differentially private transactional data publishing.

This thesis studies the key technologies of privacy protection in transactional data publishing, including the anonymity model and the differential privacy model. It analyzes and improves the existing k^m-anonymity algorithm and the DPAV algorithm in detail, and applies a consistency adjustment to resolve the inconsistency of range queries. Proofs and experimental analysis show that the algorithms proposed in this thesis for transactional data publishing are practical.
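The range-tree idea behind item (2) can be illustrated with a minimal Python sketch. It is not the thesis's DPTDP implementation: the function names, the `ratio` parameter, and the geometric split of the privacy budget across tree levels are assumptions made only for illustration. The sketch also assumes each record falls into exactly one set, so every level has sensitivity 1 and the per-level budgets add up to the total epsilon. It builds a k-ary range tree over per-set counts, gives deeper levels (smaller ranges) a larger share of the budget, adds Laplace noise to every node, and answers a range query from the fewest nodes that cover the range.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is a Laplace(0, scale) sample.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def build_tree(counts, k):
    """Return levels[0] (root) .. levels[-1] (leaves).

    Each node is a tuple (lo, hi, count) covering the half-open range [lo, hi).
    The leaf level is zero-padded to a power of k so the tree is full.
    """
    n = 1
    while n < len(counts):
        n *= k
    leaves = list(counts) + [0] * (n - len(counts))
    levels = [[(i, i + 1, c) for i, c in enumerate(leaves)]]
    while len(levels[0]) > 1:
        prev = levels[0]
        parents = [
            (prev[i][0], prev[i + k - 1][1], sum(node[2] for node in prev[i:i + k]))
            for i in range(0, len(prev), k)
        ]
        levels.insert(0, parents)
    return levels


def level_budgets(epsilon, height, ratio=2.0):
    # Hypothetical geometric allocation: depth d gets weight ratio**d,
    # so the root (largest range) gets the least budget.
    weights = [ratio ** d for d in range(height)]
    total = sum(weights)
    return [epsilon * w / total for w in weights]


def noisy_tree(levels, epsilon, ratio=2.0, rng=None):
    # Add Laplace noise with scale 1/eps_level to every node of each level.
    rng = rng or random.Random()
    budgets = level_budgets(epsilon, len(levels), ratio)
    return [
        [(lo, hi, c + laplace(1.0 / eps, rng)) for lo, hi, c in level]
        for level, eps in zip(levels, budgets)
    ]


def range_query(levels, lo, hi, depth=0, index=0):
    """Estimate the count over [lo, hi) (integer bounds) from covering nodes."""
    node_lo, node_hi, value = levels[depth][index]
    if hi <= node_lo or node_hi <= lo:
        return 0.0  # node is disjoint from the query range
    if lo <= node_lo and node_hi <= hi:
        return value  # node lies fully inside the query range
    k = len(levels[depth + 1]) // len(levels[depth])
    return sum(
        range_query(levels, lo, hi, depth + 1, index * k + j) for j in range(k)
    )


if __name__ == "__main__":
    counts = [5, 3, 0, 2, 7, 1, 4, 6, 2]  # made-up per-set counts
    tree = noisy_tree(build_tree(counts, k=3), epsilon=1.0)
    print(range_query(tree, 2, 7))  # noisy estimate of sum(counts[2:7])
```

Because each node is perturbed independently, the noisy children of a node generally do not sum to the noisy parent. This is the range-query inconsistency that the CA algorithm in item (3) is meant to remove; the sketch does not attempt that adjustment.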
Keywords/Search Tags:Transactional Data Publishing, Privacy Protection, Anonymity Algorithm, Differential Privacy Algorithm, Consistency Constraint Query