Research On Quantitative And Qualitative Protection Of Differential Privacy

Posted on: 2019-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Bai
Full Text: PDF
GTID: 2428330590992289
Subject: Computer technology
Abstract/Summary:
Differential privacy (DP) has become one of the most popular privacy protection technologies. It provides mathematically rigorous, quantitative control of privacy leakage. In recent years, a great deal of research on DP has appeared, mainly aiming to protect either the data or the model. DP-based data publication algorithms transform raw data into synthetic data with similar characteristics, which protects the privacy of the raw data. In addition, many studies aim to improve model performance under DP. However, implementing DP requires adding noise, which distorts the data distribution or degrades model performance; this is unacceptable for industrial applications that have strict requirements on data or models.

To improve model performance under DP protection and to apply DP in industrial systems, we study DP from two perspectives: quantitative mining and qualitative analysis.

In terms of quantitative mining, we study the DP-based decision tree model. Previous work embeds DP in the decision tree model with one-step data mining computation (a two-level subtree is generated in each operation), which aims to protect the decision tree model and prevent users from deducing the structure of the model. We embed DP in the decision tree model with two or more steps of data mining computation (a three-level or deeper subtree is generated in each operation). When the subtree space is too large to enumerate, we use Markov Chain Monte Carlo (MCMC) to simulate the distribution over the subtree space.

In terms of qualitative analysis, previous work has studied DP only from the perspective of quantitative mining, i.e., focusing on improving the performance of DP-based models. From a new perspective, qualitative analysis, we study the effect of DP data publication algorithms on the attribute relationships of the raw dataset. Qualitative analysis focuses on ranks, patterns, important sets, and similar properties of the data, and naturally has a better ability to accommodate noise. We design a DP-based qualitative analysis framework, illustrated with two typical qualitative analysis tasks, which helps data buyers understand the attribute relationships of the raw data more deeply and make better use of the data they buy without leaking privacy.

In this paper, the main work and contributions are as follows:

· Previous work embeds DP in the decision tree model with only one-step computation. We propose a new idea and embed DP in the decision tree at different depths to improve model performance.
· We propose an algorithm that combines exhaustive search and MCMC to embed DP in a decision tree model at any depth in a time-efficient way.
· Experimental results show that the performance of the decision tree model increases as the embedding depth increases. A deep combination of DP and the decision tree model does improve the prediction accuracy of the model.
· Compared with quantitative mining, qualitative analysis naturally has a greater ability to accommodate noise. We make the first attempt to study DP in qualitative analysis and try to find a way to apply DP in industrial systems.
· We propose a DP-based qualitative analysis framework that helps data buyers conduct qualitative analysis tasks and know the corresponding confidence levels. We use two typical qualitative tasks (two classifiers) as examples to show the application of the framework.
· Experimental results on public and private industrial datasets show that, with the qualitative analysis framework, the qualitative analysis tasks can be completed with high confidence even when the privacy budget ε is very small (e.g., 0.05). Qualitative analysis has the potential to enable the application of DP in industrial systems.

Differential privacy is a very effective privacy protection technology, and it is receiving more and more attention. In this paper, we improve the algorithm that combines DP with the decision tree model in quantitative mining. In addition, from the new perspective of qualitative analysis, we study the possibility of applying DP in industrial systems. We hope our work can bring new ideas for improving model performance and for the industrial application of DP, and contribute to the development of DP technology.
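
The abstract does not name the DP mechanism used to choose among candidate subtrees. As a minimal sketch, assuming the exponential mechanism (a standard tool for private selection) and a hypothetical quality score per candidate subtree, the two search modes mentioned above might look as follows. The function names, the `sensitivity` parameter, and the uniform proposal are illustrative assumptions, not the thesis's actual algorithm.

```python
import math
import random

def exponential_mechanism_probs(scores, epsilon, sensitivity=1.0):
    """Exact selection probabilities: p_i proportional to exp(epsilon * score_i / (2 * sensitivity))."""
    scale = epsilon / (2.0 * sensitivity)
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(scale * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def sample_exhaustive(scores, epsilon, sensitivity=1.0, rng=random):
    """Exhaustive search: score every candidate subtree, then sample one index exactly."""
    probs = exponential_mechanism_probs(scores, epsilon, sensitivity)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]

def sample_mcmc(score_fn, n_candidates, epsilon, sensitivity=1.0, steps=2000, rng=random):
    """Metropolis-Hastings with a uniform (symmetric) proposal.

    Only the current and proposed candidates are scored at each step, so the
    full candidate space is never enumerated. After finitely many steps the
    sample only approximates the target distribution.
    """
    scale = epsilon / (2.0 * sensitivity)
    current = rng.randrange(n_candidates)
    for _ in range(steps):
        proposal = rng.randrange(n_candidates)
        log_ratio = scale * (score_fn(proposal) - score_fn(current))
        # Accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
    return current
```

With a small candidate set, `sample_exhaustive` draws from the target distribution exactly. `sample_mcmc` trades exactness for scalability: because the chain runs for a finite number of steps, its output distribution, and therefore any privacy guarantee derived from it, is only approximate.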
Keywords/Search Tags: Differential privacy, Decision tree, Exhaustive search, Markov Chain Monte Carlo, Quantitative mining, Qualitative analysis