Research On Query Processing Under Differential Privacy In Multiple Scenarios

Posted on:2024-04-23

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y F Wang

Full Text:PDF

GTID:1528306944964299

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of cloud computing,big data and artificial intelligence,data,which is of great significance to promote social and economic development and improve people’s lives,has become a fundamental strategic resource and key production factor in the digital era.Query processing is a fundamental task of data analysis and an important mean to obtain value from data.However,since data usually contains a large amount of sensitive information and faces different privacy leakage risks in different scenarios,querying the sensitive data directly can lead to serious privacy leakage problems.Privacy-preserving query processing techniques provide a feasible solution to address the personal privacy leakage problem caused by query processing.Differential Privacy and its variants proposed in recent years have become the industry standard for protecting the privacy of individuals during data query processing.However,existing works fail to address the core scientific problem of balancing the benefits of privacy protection with the costs of privacy protection:privacy-preserving query processing often requires the use of privacy-preserving techniques to perturb,encrypt,and other operations on private information,which can result in a certain degree of data utility loss or extremely high resource cost.To solve the above core scientific problem,this thesis aims to solve the risk of personal privacy information leakage in query processing under three scenarios:data local scenario,data central scenario and federated scenario,and conducts systematic and in-depth research on query processing techniques to satisfy differential privacy:(1)In the data local scenario,a local differentially private prefix-sumbased range query processing method called PRISM is proposed.PRISM first contains a LDP mechanism called Range based Randomized Response(RRR).RRR improves the utility of the prefix sums by collecting the necessary binary range information under LDP.Next,PRISM contains a new type of prefix-sums-based cube called Grained Prefix-Sum(GPS)cube.GPS cube deal with the large domains of attributes by binning the domains of attributes into coarse granularities.Finally,PRISM contains a Data-DEpendent SeLective PreFix-sum CollecTion Strategy(DELFT).DELFT alleviates the curse of dimensionality by collecting the 2-D prefix sums among part of attributes in a data-dependent way.We conduct experiments on both real-world datasets and synthetic datasets.The results confirm the effectiveness of PRISM over existing methods.(2)In the data central scenario,a differentially private double-sided clipping-based multi-dimensional analytic approach called PRIMA is proposed.In PRIMA,we propose a Clipping and Debiasing Sum Query Processing Method(CLASP)which first bounds the sensitivity of sum queries by clipping the data table from both sides and then debiases the data table based on a symmetric sparse vector technology.Moreover,we propose a Hypothesis Testing based Prefix Sum Computing Method(SCOPE)to compute a base prefix-sum cube based on hypothesis testing.By employing the base prefix-sum cube,any remaining cube can be constructed with constant pieces of DP noise aggregated.We conduct experiments on both real-world and synthetic datasets.The results confirm the effectiveness of PRIMA over existing approaches.(3)In the federated scenario,a differentially private federated query processing method called DP-FedQuery is proposed.In particular,DPFedQuery first designs a D-Privacy based Multi-Party Comparison Method(DPC).By adding an appropriate amount of noise that satisfies d-privacy,DPC makes sure the values held by each participant can be compared safely.Then,DP-FedQuery contains a Gradient Sampling based Multi-party Security Smoothing Sensitivity Calculation Method(GSS).By formalizing the problem of calculating smooth sensitivity as an optimization problem,GSS can infer the smoothing sensitivity of a particular federal query operation with low computational overheads by computing the gradient of the weighted local sensitivity corresponding to a finite number of sampling points under ciphertext.Finally,DP-FedQuery contains a Binary Acceleration based Distributed Noise Generation Method(BNG).By decomposes the logarithmic operations in inverse permutation sampling via binary computation,BNG can generate the noise that obeys the Laplace distribution in malicious majority scenario with lower computational overhead.The effectiveness of DP-FedQuery has been verified through extensive experiments on a standard database.

Keywords/Search Tags:

query processing, differential privacy, secure multi-party computation

Related items

1	Research On Some Basic Protocols And Applications Of Secure Multi-party Computation
2	Research On Secure Multi-Party Computation And Its Application
3	Research On Several Secure Multi-Party Computation Problems And Applications
4	Research On Several Secure Query Protocol Of Database
5	Research On Data Privacy Preservation Technologies Using Secure Multi-Party Computation
6	Research On Key Technology Of Secure Two-party Computation And Its Applications
7	Research On Several Application Protocols Of Secure Multi-Party Computation
8	Research On Basic Theory And Crucial Technique Of Efficient Secure Two-party Computation
9	Design And Applications Of Secure Multi-party Computation Protocol Based On Homomorphic Encryption
10	Privacy Protection Method For Vertical Federated Learning Based On Secure Multi-party Computation