Research On Implicit Privacy Protection Method Based On Clustering Model

Posted on:2015-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:X M Gao

GTID:2308330479489712

Subject:Computer Science and Technology

Abstract/Summary:

Government agencies and other organizations produce huge amounts of data every day. Such a mixture of data is also called a “data market”. With the strong promotion of centralized data storage management and the rapid development of the Internet, publishing and sharing data has become desirable. However, publishing large volumes of data inevitably risks privacy disclosure, so how to resolve the conflict between privacy disclosure and data quality has attracted researchers’ attention.

Traditional privacy-preserving methods based on generalization hierarchies usually focus on equivalence classes or data blocks, making it hard for attackers to infer identifiers or reducing attackers’ posterior knowledge. This type of strategy considers only parts of the data and is therefore called a local method. Its limitation is that it ignores the global cost function and neglects changes to the model of the original data set. To solve this problem, this thesis proposes two ideas: a novel t-closeness method and an attribute-based Gaussian mixture model.

Firstly, to address the problem that the original t-closeness ignores the global cost of publishing data during the suppression process, we add a new constraint d. To minimize d, the records with the lowest cost are suppressed, so that the global cost is reduced.

Secondly, to bridge the relationship between sensitive attributes and the cluster model of the data, we adopt an improved Gaussian mixture model based on private feature selection. To enhance the model’s discriminative ability, the original component is further divided into three parts. An integrated likelihood function is used to obtain the model parameters, and the model can select features directly. To keep a certain distance between the cluster model and the original one, the weights of sensitive attributes are restricted to a specific range, so that the published data receives global protection.

Experimental results show that the proposed t-closeness method performs better at protecting private data, and that the novel privacy-preserving Gaussian mixture model has a stronger feature selection ability.
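As background for the t-closeness discussion above, the following is a minimal sketch of the standard t-closeness check only. It is not the thesis's method: the constraint d and the cost-based suppression are not specified in this abstract and are not modeled here. The sketch assumes a categorical sensitive attribute with equal ground distance between values, in which case the Earth Mover's Distance reduces to the variational distance. The function names and sample records are hypothetical.

```python
from collections import Counter


def distribution(values, domain):
    """Relative frequency of each domain value within `values`."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in domain]


def variational_distance(p, q):
    """Half the L1 distance between two distributions.

    Assumption: categorical attribute with equal ground distance,
    where the Earth Mover's Distance equals this quantity.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def violating_classes(records, t):
    """Return ids of equivalence classes whose sensitive-value
    distribution is farther than t from the overall distribution.

    `records` is a list of (class_id, sensitive_value) pairs.
    """
    domain = sorted({s for _, s in records})
    overall = distribution([s for _, s in records], domain)

    groups = {}
    for class_id, sensitive in records:
        groups.setdefault(class_id, []).append(sensitive)

    return sorted(
        class_id
        for class_id, values in groups.items()
        if variational_distance(distribution(values, domain), overall) > t
    )


# Hypothetical sample: two equivalence classes, one sensitive attribute.
sample = [("A", "flu"), ("A", "flu"), ("B", "flu"), ("B", "cancer")]
print(violating_classes(sample, 0.2))  # ['A', 'B']
```

In this sample both classes sit at distance 0.25 from the overall distribution, so they violate t = 0.2 but satisfy t = 0.3. A suppression strategy such as the one the abstract describes would then have to decide which records to remove from the violating classes.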

Keywords/Search Tags:

privacy-preserving, t-closeness, Gaussian mixture model, feature selection

Related items

1	An Enhanced T-Closeness Privacy-preserving Method
2	Research On Key Technologies Of Privacy Preserving Data Mining Based On Local Differential Privacy
3	Feature Selection Algorithm Based On Privacy Preserving
4	Privacy Preserving Feature Selection In Distributed Environment
5	Feature Selection Mechanism For Multimodal Social Media Data With Privacy Protection
6	Research For Privacy-preserving Domain Adaptation
7	Research On Gunshot Detection In Public Places
8	Ensemble Feature Selection Based On Privacy Preserving
9	Research On T-closeness Privacy Protection Model With The Support Of Rough Set And Cluster
10	Feature Selection Based On Differential Privacy