Research On Key Technologies Of Privacy Preserving Data Mining Based On Local Differential Privacy

Posted on: 2022-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: E L Wu
GTID: 2518306740982599
Subject: Computer Science and Technology
Abstract/Summary:
With the continuing development of data sharing and data analysis technology, collecting and mining data on distributed terminals has become routine in big data analysis. Because data owners and data collectors do not trust each other, privacy protection has become a constraint on the deeper development of big data analysis applications. In recent years, local differential privacy has received growing attention in data mining because of its rigorous mathematical foundation. Frequent itemset mining and clustering are basic mining methods with wide application. However, existing methods based on local differential privacy require a large privacy budget and yield results of limited utility. To address these problems, this thesis studies frequent itemset mining and Gaussian mixture model clustering under local differential privacy, aiming to reduce the privacy budget, increase accuracy, and improve data utility. The main work of the thesis is as follows:

(1) The binary coding perturbation scheme used in existing frequent itemset mining methods suffers from coding redundancy, which leads to large errors. An improved data perturbation scheme is proposed to protect data privacy while improving data utility. To address the large privacy budget required by the sampling and padding technique, a hidden Markov model is introduced to avoid accessing large candidate sets. A probability graph is constructed to obtain frequent itemsets, which improves the efficiency of the algorithm and reduces the utility loss of the mining results.

(2) Because existing local differential privacy k-means methods are not suited to non-spherical data distributions, this thesis proposes a Gaussian mixture model clustering method based on local differential privacy. To lower the privacy budget required by traditional data collection methods, a novel data collection method is proposed that improves data utility. A cluster merging mechanism is proposed to reduce the influence of the initial number of clusters and to reduce communication cost. The level of privacy protection and the quality of the clustering results are also improved.

Theoretical analysis and experimental results show that the proposed methods achieve privacy-preserving data mining while maintaining good data utility.
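
For readers unfamiliar with binary coding perturbation under local differential privacy, the following is a minimal sketch of the generic baseline idea, not the improved scheme proposed in the thesis. It assumes each user holds a single item from a known domain, one-hot encodes it, and flips every bit independently with a probability derived from the privacy budget epsilon; the collector then debiases the per-position sums. All function names and parameters here are hypothetical and chosen only for illustration.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting a bit unchanged.

    Two one-hot vectors differ in two positions, so each bit is given
    half of the budget: p = e^(eps/2) / (e^(eps/2) + 1).
    """
    half = math.exp(epsilon / 2)
    return half / (half + 1)


def encode_item(item_index, domain_size):
    """One-hot (binary) encoding of a single item from a known domain."""
    bits = [0] * domain_size
    bits[item_index] = 1
    return bits


def perturb_bits(bits, epsilon, rng):
    """Client side: keep each bit with probability p, flip it otherwise."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_counts(reports, epsilon):
    """Collector side: debias the per-position sums of the noisy reports.

    If c is the observed number of 1s at a position among n reports, then
    E[c] = n_true * p + (n - n_true) * (1 - p), which is solved for n_true.
    """
    n = len(reports)
    p = keep_probability(epsilon)
    domain_size = len(reports[0])
    observed = [sum(report[j] for report in reports) for j in range(domain_size)]
    return [(c - n * (1 - p)) / (2 * p - 1) for c in observed]


def simulate(true_items, domain_size, epsilon, seed=0):
    """Hypothetical end-to-end run: every user perturbs locally, then the
    collector estimates item counts from the noisy reports only."""
    rng = random.Random(seed)
    reports = [
        perturb_bits(encode_item(item, domain_size), epsilon, rng)
        for item in true_items
    ]
    return estimate_counts(reports, epsilon)
```

In this sketch every position of the vector is perturbed, so the estimation noise grows with the number of users regardless of how rare an item is. This is one way to see why the encoding and the privacy budget matter so much for the utility of the mining results.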
Keywords/Search Tags: Privacy Preserving, Local Differential Privacy, Frequent Itemsets Mining, Clustering, Gaussian Mixture Model