
Data Utility Optimization For Local Differential Privacy

Posted on: 2020-10-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z K Zhang
GTID: 1368330602986075
Subject: Control Science and Engineering
Abstract/Summary:
The recent proliferation of big data and artificial intelligence has given prominence to the importance of data, which The Economist has called the oil of the digital era. In recent years, however, governments have introduced stricter privacy protection acts, and Internet users care more about their data privacy. This has forced Internet companies to develop new technologies for collecting sensitive data from their users privately. Driven by both academia and industry, Local Differential Privacy (LDP) has become the gold standard for private data collection and has been deployed by Internet giants such as Google, Apple and Microsoft. The main idea of LDP is to perturb the raw data locally, which enforces privacy under a strict mathematical definition. However, data perturbation inevitably reduces data utility, and improving data utility is central to the wide deployment of LDP.

There are two dimensions along which data utility for LDP can be improved: aggregation algorithm optimization and privacy budget optimization. Aggregation algorithm optimization improves data utility by designing more efficient encoding algorithms to compress data. Privacy budget optimization further tunes the privacy-preserving level to alleviate the impact of perturbation once the aggregation algorithm is fixed. Based on the relationship between data owner and data consumer, privacy budget optimization can be classified into two methods: incentive mechanism design and collaborative optimization. When the data owner is not the data consumer, one can induce data owners to adopt a higher privacy budget by compensating their privacy loss. When the data owner is also the data consumer, one can collaboratively optimize the data owner's privacy loss and data quality to decide the optimal privacy budget.

Recent studies have made considerable progress in data utility optimization for LDP, but some drawbacks remain: a) the data utility of existing aggregation algorithms for high-dimensional data is very low, so they cannot meet the demand for high-dimensional data analysis; b) existing incentive mechanisms cannot resolve the information asymmetry between the fusion center and data owners, and cannot handle real-time aggregation applications; c) collaborative optimization is the key to privacy budget optimization when the data owner is also the data consumer, yet related work is absent. Building on the state of the art, this thesis proposes mechanisms to address these drawbacks:

1. Study a high-dimensional data aggregation algorithm with high data utility. The marginal table is the workhorse of high-dimensional data analysis. We therefore take marginal release as the object of study and explore aggregation algorithm optimization strategies for high-dimensional data analysis. This thesis proposes CALM, which uses a set of carefully chosen marginals, called views, to capture the correlations among all high-dimensional attributes. All other marginals can then be reconstructed from consistent views using maximum entropy optimization. The novelty of CALM is a practical algorithm that chooses an optimal set of views by analyzing multiple error sources, which significantly alleviates the impact of perturbation. Further, CALM can handle non-binary attributes and large numbers of attributes, and improves on the performance of the state of the art by 1 to 2 orders of magnitude.

2. Study static incentive-based privacy budget optimization. The main idea is to induce users to adopt a higher privacy budget by compensating their privacy loss, thus improving data utility. The privacy loss is determined by both the privacy budget and the privacy preference, which varies among users. For example, women care more about their age than men do, and patients care more about their location than healthy people do. In incentive mechanism design, the fusion center usually does not know users' exact privacy preferences, which leads to an information asymmetry problem. This thesis designs REAP to solve the information asymmetry problem by resorting to Contract Theory. Specifically, assume that the fusion center knows the distribution of users' privacy preferences and intends to design a contract for each type of user, where each contract is a pair of a privacy budget and the corresponding compensation. The fusion center broadcasts all contracts to all users, and each user chooses the contract that optimizes his utility. The challenge lies in ensuring that all users truthfully reveal their privacy preferences. REAP deals with this problem by solving an optimization problem with incentive compatibility constraints.

3. Study dynamic incentive-based privacy budget optimization. Real-time data aggregation underlies a wide spectrum of crowdsensing applications. For example, public health monitoring organizations can periodically collect physical index data from users to monitor and control the spread of disease, which requires users' long-term participation. Previous studies on static incentives cannot fulfill this requirement, since some users may go unselected for a long time and then quit. To guarantee long-term participation in real-time data aggregation, this thesis designs LEPA, which uses an online algorithm to jointly optimize the system utility in each time slot and guarantees that every user is selected with a certain probability.

4. Study collaborative optimization of privacy budget and data utility. Collaborative optimization is designed for the scenario where the data owner is also the data consumer. This thesis investigates one typical application, namely location privacy protection and spectrum allocation in database-driven cognitive radio. Database-driven cognitive radio is an effective method for resolving the interference between primary users and secondary users. However, this technology requires primary users and secondary users to provide their location information, directly or indirectly, for dynamic spectrum allocation. This thesis designs UMax, a privacy-preserving, utility-maximizing database query protocol. By collaboratively optimizing location privacy and spectrum utilization, UMax allows both parties to choose their optimal privacy budget to optimize data utility, thus improving spectrum utilization.
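
As a minimal illustration of the local perturbation and privacy budget discussed in the abstract, the sketch below uses generalized randomized response, a standard textbook LDP mechanism. It is not CALM, REAP, LEPA or UMax, none of which are specified here in enough detail to reproduce. The function names, the four-value domain and the frequency weights are assumptions made purely for illustration.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain, epsilon, rng):
    """Perturb one value locally with generalized randomized response.

    The true value is reported with probability p = e^eps / (e^eps + k - 1);
    otherwise one of the k - 1 other values is reported uniformly at random.
    """
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate(reports, domain, epsilon):
    """Aggregator side: unbiased frequency estimates from perturbed reports."""
    k = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)    # probability of reporting the true value
    q = 1.0 / (e + k - 1)  # probability of reporting one specific other value
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


def mean_abs_error(epsilon, n=20000, seed=0):
    """Simulate n users (hypothetical data) and return the mean absolute error."""
    rng = random.Random(seed)
    domain = ["a", "b", "c", "d"]        # illustrative domain
    weights = [0.5, 0.3, 0.15, 0.05]     # illustrative true distribution
    true_values = rng.choices(domain, weights=weights, k=n)
    reports = [grr_perturb(v, domain, epsilon, rng) for v in true_values]
    estimates = grr_estimate(reports, domain, epsilon)
    true_counts = Counter(true_values)
    errors = [abs(estimates[v] - true_counts[v] / n) for v in domain]
    return sum(errors) / len(domain)


if __name__ == "__main__":
    for eps in (0.5, 1.0, 2.0, 4.0):
        print(f"epsilon={eps}: mean abs error={mean_abs_error(eps):.4f}")
```

In this sketch a larger privacy budget keeps more reports truthful, so the estimation error shrinks. That trade-off between privacy level and utility is what privacy budget optimization works with once the aggregation algorithm is fixed.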
Keywords/Search Tags: Local differential privacy, aggregation algorithm optimization, incentive mechanism design, collaborative optimization, crowd sensing system