Hybrid Methods For Privacy Preserving Data Sharing Techniques On Data Mining Environments

Posted on:2014-02-21

Degree:Doctor

Type:Dissertation

Country:China

Candidate:W Lu

Full Text:PDF

GTID:1228330398971251

Subject:Computer application technology

Abstract/Summary:

Privacy preservation is an important and fruitful field of research in the context of data mining. This thesis is a first step in gathering knowledge from an unbounded world of information toward the goal of the best possible privacy preservation. It covers knowledge discovery in databases, the foundations of privacy preserving data mining, the k-anonymity model for privacy preserving data sharing, and the secure multiparty computation model for privacy preserving data sharing. It also introduces some useful data mining algorithms that are tightly bound to privacy preservation, a classification of privacy preserving techniques, and applications of privacy preserving data mining. Hybrid methods for privacy preserving data sharing open the way to the finer points of hybrid methodologies in privacy preserving data mining environments. Experiments based on the literature, mainly on data sanitization and k-anonymization, were carried out to test the set of hypotheses stated in this thesis and are reported in its latter part.

This thesis focuses primarily on privacy issues in data mining, especially when data are shared before mining. Specifically, it considers scenarios in which applications of privacy preserving data mining and hybrid techniques require privacy safeguards. Preserving privacy in such scenarios is multifaceted: one must not only meet privacy requirements but also guarantee valid data mining results. This dilemma indicates a pressing need to rethink the mechanisms that enforce privacy safeguards without losing the benefits of data mining. The mechanisms introduced here can lead to new privacy control methods that convert a database into a new one in such a way as to preserve the main features of the original database for mining.

Privacy preserving data mining has become an increasingly popular and continuously evolving field of study.
It allows privacy-sensitive data to be shared for analysis purposes. Today, organizations have become increasingly unwilling to share their statistics, and individuals frequently either refuse to share their data or provide inaccurate information. In recent years, privacy preserving data mining has been studied extensively because of the widespread proliferation of sensitive information on the Internet. Knowledge is power; nevertheless, privacy must be protected in the pursuit of knowledge.

In particular, this thesis addresses the problem of transforming a database to be shared into a new one that conceals private information while preserving the general patterns and trends of the original database. This challenging problem is addressed by proposing a hybrid model and framework for privacy preserving data mining which ensures that the mining process will not violate privacy, up to a certain degree of security. The framework encompasses a family of privacy preserving data transformation methods, a library of algorithms, retrieval facilities to speed up the privacy-concealing transformation process, and a set of metrics to evaluate the effectiveness of the proposed algorithms in terms of information loss and to quantify how much private information has been disclosed.

Privacy preserving data sharing can benefit a wide range of data sources, for instance: healthcare records; criminal justice investigations and proceedings; financial institutions and transactions; biological traits such as genetic material; residence and geographic records; ethnicity; privacy breaches; and location-based services and location data.

This research describes why data mining does not inherently threaten privacy and surveys two approaches that enable it without revealing sensitive data: models of k-anonymity and secure multiparty computation.
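As a rough illustration of the k-anonymity model, the sketch below checks whether a small table is k-anonymous, that is, whether every combination of quasi-identifier values is shared by at least k records. The sample records, the choice of quasi-identifier columns, and the helper name `is_k_anonymous` are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    occurs in at least k records (hypothetical helper for illustration)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical records whose age and zip code are already generalized.
records = [
    {"age": "30-39", "zip": "100**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "100**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "200**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "200**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))  # each group has 2 records
print(is_k_anonymous(records, ["age", "zip"], k=3))  # no group reaches 3 records
```

Each (age, zip) group here holds exactly two records, so the table satisfies 2-anonymity but not 3-anonymity; reaching a higher k would require further generalization or suppression, which is where information loss enters.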
Finally, the hybrid model is tested with the hybrid algorithm, against several metrics and using several data sanitization methods, to establish its level of success for practical use.

The key contributions of this research are:

(1) The development of an advanced k-anonymity-inducing model and algorithm, and of a secure multiparty computation model, for privacy preserving data sharing. Various k-anonymization methods were studied, and the problem of k-anonymization with minimal loss of information was analyzed; the proposed model is shown to be more advanced with respect to both loss of information and reliability. An approximation algorithm was then adapted to achieve an approximation ratio with respect to the entropy measure. Finally, an algorithm was developed that generates k-anonymous decision trees; this algorithm is 35% faster than existing algorithms. A secure multiparty computation (SMC) model was also developed, which illustrates homomorphic encryption as the basic idea of SMC-based privacy preserving data mining techniques.

(2) The formulation of a hybrid model and framework for privacy preserving data sharing in data mining environments. Families of privacy preserving data sharing (PPDS) methods were studied for protecting privacy before data is shared for data mining and clustering. The most important milestone in this approach is the development of a hybrid framework with a model that is more efficient and versatile. The model introduced here has distinctive advantages, such as very high effectiveness and reliability compared with other available models.

(3) The development of a hybrid algorithm for privacy preserving data sharing. To enforce knowledge protection in data mining, a library of algorithms was proposed. These algorithms are designed taking into account heuristics for our HMPPDS (Hybrid Methods for Privacy Preserving Data Sharing) methods.
The hybrid algorithm was developed by amalgamating the best features of the proposed sanitization algorithms and the k-anonymized decision tree; it fits the framework described above and is superior to all existing algorithms in this context.

Our exploration concludes that privacy preserving data mining is, to some extent, possible. This thesis demonstrates, both empirically and theoretically, the practicality and feasibility of achieving privacy preservation in data mining. Our experiments reveal that the framework is effective, meets privacy requirements, and guarantees valid data mining results while protecting sensitive information...
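Contribution (1) names homomorphic encryption as the basic idea behind SMC-based techniques. The abstract does not say which scheme the thesis uses, so the sketch below assumes the Paillier cryptosystem as a stand-in for an additively homomorphic scheme. It is a toy with tiny, insecure primes, and every name and value in it (`encrypt`, `decrypt`, `add_encrypted`, the party counts) is hypothetical. It shows three parties contributing encrypted local counts that are summed without any individual count being decrypted.

```python
import math
import random

# Toy Paillier parameters. ASSUMPTION: Paillier is used only as an example
# of additive homomorphism; these primes are far too small to be secure.
P, Q = 293, 433
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # modular inverse of LAM mod N (Python 3.8+)


def encrypt(m):
    """Encrypt an integer 0 <= m < N with generator g = N + 1."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:  # r must be invertible mod N
            break
    # (N + 1)^m mod N^2 equals 1 + m*N, so no extra exponentiation is needed.
    return ((1 + m * N) * pow(r, N, N2)) % N2


def decrypt(c):
    """Recover the plaintext using the private values LAM and MU."""
    x = pow(c, LAM, N2)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N2


# Hypothetical local counts held by three parties.
local_counts = [120, 75, 310]
ciphertexts = [encrypt(m) for m in local_counts]

encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(encrypted_total, c)

# Only the key holder decrypts, and only the aggregate is revealed.
print(decrypt(encrypted_total) == sum(local_counts))
```

The aggregator in this sketch only ever handles ciphertexts; a real deployment would need large keys, a vetted library, and a protocol for who holds the private key.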

Keywords/Search Tags:

Data Mining, Privacy Preservation, Data Mining Algorithms, k-anonymity Model, Secure Multiparty Computation

Related items

1	Research On Key Technologies Of Privacy-preserving Data Mining On The Cloud
2	To Maintain The Privacy Of Data Mining Research
3	Analysis And Research Of Data Mining And Privacy Protection
4	Secure Data Processing Technology Based On Differential Privacy
5	Research On Privacy-Preserving Data Mining Algorithms
6	Research On Some Key Technologies Of Privacy Preserving Data Mining
7	Study On Privacy Preserving Classification Data Mining
8	Research On Anonymity Models And Algorithms For Privacy-Preservation Data Publishing
9	Distributed Gene Sequence Similarity Calculation Based On Secure Multiparty Computation
10	Outsourcing Computation Of Privacy Preserving K-means Clustering Algorithm Based On Secure Multiparty Computation