Research On Identification Of Implicit Privacy Dimension In High Dimension Data

Posted on:2015-08-22

Degree:Master

Type:Thesis

Country:China

Candidate:H F Ba

Full Text:PDF

GTID:2308330479989748

Subject:Computer Science and Technology

Abstract/Summary:

Most data publishers hold the original raw data but lack the ability to apply data mining techniques, while data analysts often suffer from a lack of data. Some data publishers worry that releasing data without any protective measures will lead to privacy disclosure. Applying privacy preservation techniques, however, distorts the data and may inevitably affect the later data mining process.

In the era of big data, privacy has become a challenging issue that attracts a great amount of research effort. Traditional privacy-preserving algorithms focus on preventing sensitive data from being associated with a specific person via a manually assigned set of features. However, how to determine such a set of features is seldom studied. This question urgently needs study, since it is impossible to manually find the complete set of features from which private data can be deduced given a huge volume of data.

In this paper, we first study privacy-preserving issues theoretically and propose the Implicit Privacy Feature Set (IPFS) algorithm to find the complete inferring privacy feature set, which is tuned by a threshold. The Key Implicit Privacy Feature Set (KIPFS) algorithm is designed to find the key inferring privacy feature set. The key inferring privacy features can be applied to various privacy protection techniques, thus protecting individuals' privacy. To evaluate the effectiveness and efficiency of the proposed approach, two state-of-the-art algorithms, k-anonymity and t-closeness, are implemented as baselines, and their revised versions are applied on the KIPFS for performance comparison.

Experimental results show that by integrating the KIPFS into both algorithms, we can achieve better performance in terms of efficiency and data quality. Furthermore, the impact of privacy protection on the data distribution can also be minimized. Therefore, the quality of the data mining results can be guaranteed.
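For readers unfamiliar with the k-anonymity baseline named in the abstract, the following is a minimal sketch of the standard k-anonymity condition: every combination of values over a chosen feature set must be shared by at least k records. It is not the IPFS or KIPFS algorithm, which the abstract does not specify. The function name `is_k_anonymous`, the sample records, and the feature names are illustrative assumptions.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Check the standard k-anonymity condition on a list of dict records.

    `quasi_identifiers` stands in for the manually assigned feature set
    the abstract refers to; it is a hypothetical input here.
    """
    # Count how many records share each combination of feature values.
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    # The table is k-anonymous if every group has at least k members.
    return all(count >= k for count in groups.values())


# Hypothetical, already generalized sample data (not from the thesis).
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "asthma"},
]

print(is_k_anonymous(records, ["age", "zip"], 2))
print(is_k_anonymous(records, ["age", "zip"], 3))
```

The result depends entirely on which features are passed as `quasi_identifiers`, which is why the choice of that feature set, the question the thesis addresses, matters for any such technique.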

Keywords/Search Tags:

privacy preserving, implicit privacy dimension, data mining, data publishing

Related items

1	Research On Privacy Preserving Publishing Of Big Location Data Based On Differential Privacy
2	Research On Privacy Preserving Methods For Data Mining
3	Privacy-preserving data mining through data publishing and knowledge model sharing
4	Research On Privacy Preserving Big Data Publishing Technology
5	Models And Methods For Privacy-Preserving Data Publishing
6	Research On Several Problems Related To Privacy-preserving Microdata Publishing
7	Research On The Statistical Partition Publishing And Privacy Preserving Method Of Big Location Data
8	Research On Privacy Preserving Technology For Data Publishing
9	A Study On User Data Privacy-Preserving Mechanism With Differential Privacy
10	Robust Data Anonymization Techniques In Privacy-Preserving Data Publishing