Font Size: a A A

A New Method Of Linear Query For Differential Privacy Protection

Posted on:2013-07-26Degree:DoctorType:Dissertation
Country:ChinaCandidate:S X ChenFull Text:PDF
GTID:1108330434971227Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Differential privacy is a robust principle for privacy preserving data analysis tasks, and has been successfully applied to a variety of applications. Differing from many earlier privacy preserving principle, differential privacy defines a formal privacy guarantee in a very rigorous manner. All differentially private data analysis techniques should be able to be proved to satisfy differential privacy by mathematics. Roughly speaking, the risk for an individual incurred by participating in a private data publishing is quantitatively controlled by differential privacy, hence no’adversary can infer sensitive information for any participant from published data. To date, differential privacy has been studied by a vast of researchers, and many effective techniques for private data analysis tasks have been proposed. As a new research domain, however, many new problems are still left open. In this paper, we propose three new query processing methods under differential privacy, which significantly improve the efficiency and effectiveness from existing techniques.A simple but effective approach for achieving differential privacy is Laplace mechanism, which conceals personal information by introducing i.i.d. noises into query answers. Despite of its simplicity and wide applications, this mechanism requires that the global sensitivity of the query answer w.r.t. individual participant must be finite and low. For many complex queries, such as aggregation on the result of SQL query, and subgraph counting, the query answer can have extremely large or even unbounded global sensitivity. Such complex queries cannot be tackled by existing mechanisms. We first propose a mechanism that is based on empirical sensitivity, which can handle queries that have complicated relations with the individual participants, and allows unbounded global sensitivity of the query. Moreover, our mechanism can process subgraph counting for any subgraphs and it can achieve node differential privacy, which is impossible in the past.For many linear query sequences, the i.i.d. noises introduced by Laplace mechanism is not optimal. Some mechanisms introduce correlated or dependent noises into query answers, in order to improve query accuracy. When the query sequence is arbitrarily issued from users, however, these mechanisms require computation cost that is exponential in data dimensionality to optimize the distribution of noises. Hence, high dimensional datasets are intractable for these mechanisms. We propose an improved mechanism that is based on the notion of sub-sensitivities. Our mechanism can improve query accuracy by introducing dependent noises into query answers, while its computation cost is only polynomial in data dimensionality. Hence, compared to existing mechanisms, our mechanism provides significantly better efficiency and usability.Lastly, we propose post-processing techniques for integrating all available query answers in order to improve query accuracy by utilizing the underlying redundancy and dependency in queries. Our approaches can also give meaningful estimate answers to new queries when the privacy budget has been used up. An important benefit of the approaches is that for many important types of queries they can avoid explicit construction of histograms on the entire domain of a tuple, thus they can be used for high dimensional data.
Keywords/Search Tags:privacy preservation, differential privacy, Laplace mechanism, global sen-sitivity, local sensitivity, recursive mechanism, smooth sensitivity, empirical sensitivity, sub-sensitivity mechanism, K-norm mechanism, principal component analysis
PDF Full Text Request
Related items