
Research On Principal Component Algorithm Of Differential Privacy Based On Covariance Matrix

Posted on: 2022-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: J X Zhang
Full Text: PDF
GTID: 2518306737953389
Subject: Applied Statistics
Abstract/Summary:
In the era of big data, industries of all kinds have accumulated huge amounts of data. If such high-dimensional data is handled improperly, it easily leads to the "curse of dimensionality". Principal Component Analysis (PCA), a standard method of data analysis and statistics, projects the original high-dimensional data onto a lower-dimensional principal component space, thereby reducing the data dimension, simplifying data analysis, and saving computation cost. The rapidly growing volume of data also contains a great deal of private information; without protection, privacy leakage is very likely. Moreover, traditional privacy protection models (k-anonymity, l-diversity, t-closeness) are exposed to homogeneity attacks and similarity attacks and cannot effectively protect data sets. The Differential Privacy model rests on a firm mathematical foundation and on the assumption that the attacker has maximum background knowledge; it provides a provable mathematical model and can be realized simply by adding noise. It has therefore become one of the most effective privacy protection mechanisms at present.

Differentially private Principal Component Analysis integrates Differential Privacy into the PCA algorithm. On the one hand, it converts high-dimensional data into low-dimensional data; on the other hand, it protects the privacy of the original data. Existing algorithms of this kind mainly add noise to all elements of the projection matrix or of the covariance matrix, which introduces excessive noise and sharply reduces the availability of the data set. This paper adds Laplace noise to the covariance matrix to protect the differential privacy of the data set and proposes a covariance-matrix-based differentially private PCA algorithm called CMPDP. The method adds Laplace noise to the main diagonal of the covariance matrix. Compared with adding noise to all elements of the covariance matrix or to all of the data, it introduces less noise. The paper proves theoretically that the CMPDP algorithm strictly satisfies the mathematical definition of Differential Privacy, and a noise analysis shows that the amount of noise it adds is smaller than that of the traditional Laplace, LOP, Wishart, PCA-based-PPDP and other algorithms.

Finally, this paper uses the mean squared error (MSE) and the classification accuracy as evaluation metrics. In a comparison with the PCA-based-PPDP algorithm on the same data set, the CMPDP algorithm achieves a smaller MSE and a higher classification accuracy. Theoretical analysis and experimental verification therefore show that the CMPDP algorithm proposed in this paper can provide Differential Privacy protection for the released data set while adding less noise and preserving higher data availability.
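The diagonal-noise idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis's CMPDP implementation: the function names `perturb_diagonal` and `cmpdp_sketch`, the `sensitivity` parameter, and the noise scale `sensitivity / epsilon` are assumptions made for illustration only. The abstract does not state how the sensitivity is derived or how the data are preprocessed, so the sketch by itself carries no privacy guarantee.

```python
import numpy as np


def perturb_diagonal(cov, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise to the main diagonal of a covariance matrix.

    Assumption: the noise scale is sensitivity / epsilon, the usual
    Laplace-mechanism form. The sensitivity value is a placeholder.
    """
    rng = np.random.default_rng(rng)
    d = cov.shape[0]
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=d)
    # np.diag(noise) is zero off the diagonal, so only the d diagonal
    # entries change and the matrix stays symmetric.
    return cov + np.diag(noise)


def cmpdp_sketch(X, k, epsilon, sensitivity=1.0, rng=None):
    """Hypothetical sketch: PCA on a diagonally perturbed covariance matrix.

    Returns the k-dimensional projection of the centered data and the
    d x k projection matrix.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)           # center each feature
    cov = Xc.T @ Xc / n               # d x d sample covariance
    noisy_cov = perturb_diagonal(cov, epsilon, sensitivity, rng)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(noisy_cov)
    top = np.argsort(eigvals)[::-1][:k]
    W = eigvecs[:, top]               # top-k principal directions
    return Xc @ W, W
```

For example, `Z, W = cmpdp_sketch(X, k=2, epsilon=1.0)` projects an `n x d` array `X` onto two noisy principal components. Perturbing only the `d` diagonal entries, instead of all `d * d` entries, is what keeps the total injected noise small in this sketch.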
Keywords/Search Tags:Principal Component Analysis, Differential Privacy, Covariance Matrix, Laplace Noise