Data Mining Algorithm For Privacy Protection Based On Model Fusion

Posted on: 2019-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Liu
GTID: 2428330563456749
Subject: Software engineering
Abstract/Summary:
With the spread of the Internet and the arrival of the big data era, the amount of data generated and transmitted over networks has grown explosively. Much of this data involves user privacy and contains information that users do not want disclosed, yet existing data mining solutions can easily leak it. People therefore want to mine data while preserving privacy, that is, to protect users' sensitive information while still obtaining accurate results. Privacy-preserving data mining algorithms have long sought to balance privacy protection against accuracy, but they tend to achieve one at the expense of the other. Different application scenarios also often require different privacy protection schemes to be added to data mining algorithms. Moreover, today's data mining scenarios are more complex: more and more organizations and individuals are willing to pool their data for joint mining, provided that the security of their sensitive information is not compromised. It is therefore particularly important to protect the data security of all parties while still providing accurate and valuable mining results.

The privacy protection work of this thesis focuses on a currently popular classification model in data mining: the Bayesian network. First, after each participant performs local structure learning, the mutual information and conditional mutual information of each node are calculated. Laplace noise is then added to these values, their sum is computed using secure multi-party computation, and the mathematical expectation is obtained from the summation result. Then, following the idea of model fusion, the final Bayesian network structure is constructed in global learning, using the obtained expectations as the evaluation criterion. The fusion structure learning process uses only the mutual information and conditional mutual information between the variables in the sample set; the sample information of the local participants is not used directly. The differential privacy mechanism protects information security by adding noise to the local data so that sensitive attributes are distorted. Secure multi-party computation protects the data while the expected values of the mutual information and conditional mutual information are being summed, ensuring data security during structure learning. Finally, a complete Bayesian network is obtained by parameter learning on the existing data samples.

The algorithm of this thesis provides effective parameter learning and prediction results while protecting user privacy. The experimental results show that, compared with other schemes in the same field, the proposed scheme produces correct classification results while effectively protecting the sensitive information of all parties.
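The pipeline described in the abstract (local mutual information, Laplace noise, secure summation, expectation) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the additive secret sharing scheme, the fixed-point encoding (`SCALE`, `MODULUS`), and the `sensitivity` and `epsilon` parameters are all assumptions introduced here for illustration. Conditional mutual information and the global structure search are omitted.

```python
import math
import random
from collections import Counter

MODULUS = 2**61 - 1  # assumed modulus for additive shares (illustrative)
SCALE = 10**6        # assumed fixed-point precision for encoding reals


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in nats from paired samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum of p(x,y) * log(p(x,y) / (p(x) * p(y))) over observed pairs.
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def make_shares(value, n_parties, rng):
    """Split a real value into n additive shares modulo MODULUS."""
    encoded = round(value * SCALE) % MODULUS
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((encoded - sum(shares)) % MODULUS)
    return shares


def secure_sum(values, rng):
    """Simulate an additive-secret-sharing sum; no party's input is revealed."""
    n = len(values)
    all_shares = [make_shares(v, n, rng) for v in values]
    # Party j adds up the j-th share received from every party.
    partial = [sum(all_shares[i][j] for i in range(n)) % MODULUS
               for j in range(n)]
    total = sum(partial) % MODULUS
    if total > MODULUS // 2:  # map back to a signed value
        total -= MODULUS
    return total / SCALE


def fused_mutual_information(local_datasets, epsilon, sensitivity, rng):
    """Average of locally noised mutual information values across parties."""
    noisy = [mutual_information(xs, ys)
             + laplace_noise(sensitivity / epsilon, rng)
             for xs, ys in local_datasets]
    return secure_sum(noisy, rng) / len(noisy)


# NOTE: random.Random is used only for reproducibility; a real protocol
# would need a cryptographically secure source such as the secrets module.
rng = random.Random(0)
parties = [([0, 0, 1, 1], [0, 0, 1, 1]),   # X and Y identical
           ([0, 0, 1, 1], [0, 1, 0, 1])]   # X and Y independent
score = fused_mutual_information(parties, epsilon=1.0, sensitivity=0.1, rng=rng)
```

In this sketch each party perturbs its own statistic before sharing it, so the aggregator sees only the noisy sum. The fused `score` for a pair of variables would then serve as the edge evaluation criterion in global structure learning. The appropriate `sensitivity` depends on the sample size and the variable domains, and the abstract does not specify it.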
Keywords/Search Tags: secure multi-party computation, differential privacy, model fusion, privacy protection, Bayesian network