Research And Application Of Decision Tree Publishing Technology Based On Differential Privacy

Posted on: 2018-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
GTID: 2348330536952519
Subject: Software engineering
Abstract/Summary:
Data sharing is becoming increasingly convenient, which has raised concerns about personal privacy. Traditional database protection measures, such as security authentication and access control, are no longer effective. Differential privacy defines an extremely strict attack model that is independent of the attacker's background knowledge, gives a rigorous, quantitative measure of the risk of privacy disclosure, and performs well for the release of private data. This thesis analyzes the problems in data publishing and proposes a decision tree data publishing technology based on differential privacy. Adaptive privacy budget allocation and the addition of arithmetic noise strengthen privacy protection during data release, while a reasonable subdivision scheme ensures the utility of the released data set.

The thesis designs and implements a new data publishing technology based on privacy protection and decision trees. It introduces the exponential mechanism and the noise mechanism and analyzes the effect of different privacy budget allocation schemes on these two mechanisms. It then studies adaptive allocation of the privacy budget. By computing over the decision tree, the budget is allocated quantitatively, which overcomes the unreasonableness of uniform allocation and, by extending the life cycle of the privacy budget ε, improves privacy protection. A reasonable allocation of the privacy budget also improves the efficiency of the exponential mechanism. For continuous attributes, the algorithm must maintain a large set of candidate subdivisions, which reduces the efficiency of the exponential mechanism. The thesis addresses this by computing a weight for each specific continuous attribute and multiplying it by the interval length when selecting the optimal scheme.

Generating equivalence classes can itself disclose private information, which can be prevented by adding noise to perturb the data. The thesis further studies an asynchronous arithmetic noise technique based on the Laplace mechanism. It changes noise of a single form into an algebraic form and adds it to the equivalence classes, overcoming the shortcomings of existing algorithms. A post-processing technique is used to optimize the decision tree, which reduces classification error and improves classification accuracy.

The techniques and algorithms are tested on a standard data set, using classification accuracy and the level of privacy protection as evaluation metrics. Classification accuracy exceeds 80%, which indicates that the original characteristics of the data are preserved. The results verify that the proposed decision tree publishing technology based on differential privacy provides a high level of privacy protection. The techniques are also applied to real commodity transaction data, and a privacy analysis of the algorithm's execution confirms the validity of the research.
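
As a rough illustration of the two mechanisms the abstract names, the following Python sketch shows Laplace noise added to an equivalence-class count and an exponential-mechanism selection whose candidate probabilities can optionally be multiplied by interval lengths for continuous attributes. It is not the thesis's algorithm: the function names, the default sensitivity of 1, and the utility function are assumptions made for illustration, and the adaptive budget allocation and the arithmetic noise technique are not reproduced here.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: perturb a count with noise of scale sensitivity/epsilon.

    A sensitivity of 1 is assumed, as is usual for counting queries.
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def exponential_mechanism(candidates, utility, epsilon, rng,
                          sensitivity=1.0, weights=None):
    """Pick one candidate with probability proportional to
    exp(epsilon * utility / (2 * sensitivity)).

    `weights` is an optional, hypothetical per-candidate multiplier, for
    example the interval length of a continuous-attribute split range.
    """
    scores = [utility(c) for c in candidates]
    top = max(scores)  # shift by the maximum score for numerical stability
    probs = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    if weights is not None:
        probs = [p * w for p, w in zip(probs, weights)]
    return rng.choices(candidates, weights=probs, k=1)[0]
```

For example, `noisy_count(100, 0.5, random.Random(0))` returns the count 100 perturbed by Laplace noise of scale 2, and a smaller `epsilon` gives stronger privacy at the cost of a noisier count.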
Keywords/Search Tags: generalization, differential privacy, decision tree, data release