Font Size: a A A

Differential Privacy Based Principal Component Analysis Algorithm Design

Posted on:2018-01-08Degree:MasterType:Thesis
Country:ChinaCandidate:W X JiangFull Text:PDF
GTID:2428330590477657Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
This paper studies the problem of differentially private principal component analysis.Based on the pure differential privacy area which the literature pays little attention to,we discuss how to properly publish a positive semi-definite symmetric matrix for principal component analysis under the condition of protecting the privacy.This noise-attached matrix should not bring much distortion to the original matrix.The Wishart mechanism proposed by this paper,through selecting a sample from a properly designed Wishart distribution to construct the noise matrix,then finish the perturbation to the sample covariance matrix.This procedure is set before the principal component analysis such that any other procedure can be connected,which implies high flexibility.What's more,many related work can not ensure the positive semidefiniteness of covariance matrix while the matrix generated from Wishart mechanism can hold this property.On the problem of principal component analysis,we analyze the existing Laplace input perturbation and this paper's Wishart mechanism.We prove that both algorithms can preserve differential privacy while the Wishart mechanism has less effect on original matrix.When transferring to other problems,we point out the essential criteria of choosing Laplace or Wishart mechanism and provide the complete computing framework.We also compare our work to a series of existing work.Compared to the differentially private principal component analysis algorithm proposed by [1],our algorithm has the bound of top-k subspace closeness while they can only ensure the usefulness of principal eigenvector.In addition,we have the competing results if we set k = 1 to obtain principal eigenvector guarantee.Compared to the privacy-preserving low rank approximation algorithm proposed by[2],the bound of our algorithm is not related to k and we have less running time.Compared to the noisy power method proposed by [3],we have a better bound.
Keywords/Search Tags:Differential Privacy, Principal Component Analysis, Low Rank Approximation
PDF Full Text Request
Related items