
Research On Hiding Association Rules Based On Relative-non-Sensitive Frequent Itemsets

Posted on: 2010-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Liu
GTID: 2178360278960065
Subject: Computer software and theory
Abstract/Summary:
Data mining has emerged as a means of extracting potential knowledge from large quantities of data that cannot be obtained through conventional database operations (for example, selection and statistical queries). Because this potential information can be sensitive, data mining is particularly vulnerable to misuse or abuse. Motivated by the combined requirements of data sharing, privacy preservation and data mining, privacy preserving data mining (PPDM) has become a research hotspot in data mining.
First, this paper surveys privacy preserving data mining from the perspective of privacy preserving technology. Then, we introduce and analyze some typical privacy preserving association rule algorithms. Finally, in response to the drawbacks of existing privacy preserving association rule algorithms, this paper presents a new algorithm for privacy preserving frequent itemset mining, HarRFI.
Association rule hiding algorithms often sanitize transactional databases to protect sensitive information. Methods of sanitization include: data perturbation, which replaces an attribute value with a new value; blocking, which replaces an existing attribute value with a '?'; aggregation or merging, which combines several values into a coarser category; swapping; and sampling. Data perturbation is one of the most important sanitization approaches. However, existing perturbation methods either focus only on hiding sensitive rules, or try to reduce the impact on non-sensitive rules across the whole database while hiding sensitive rules.
In this paper, we propose a new algorithm, HarRFI, which determines which non-sensitive rules will be affected before hiding the sensitive rules, and hides sensitive rules from the standpoint of the non-sensitive rules. In this algorithm, the item to be deleted (the victim-item) from a transaction that contains a sensitive rule must satisfy two conditions: (1) it is in the sensitive rule; (2) it is not in the non-sensitive rules. Because different transactions containing the same sensitive rule may include different non-sensitive rules, HarRFI selects different victim-items in different transactions, which ensures that removing the victim-items from the selected group of transactions has the least influence on non-sensitive rules. The experiments show that, on the transaction data set, HarRFI achieves better results than other algorithms such as Naïve, MinFIA, MaxFIA and IGA. In particular, our algorithm has the least impact on non-sensitive rules.
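The per-transaction choice of victim-item described above can be illustrated with a minimal Python sketch. It is not the thesis's HarRFI implementation: the function names `pick_victim` and `sanitize` are hypothetical, itemsets stand in for rules, the sketch removes a victim from every supporting transaction instead of stopping once support falls below a threshold, and the fallback to the least harmful item when no item satisfies both conditions is an assumption of this example.

```python
from typing import FrozenSet, List, Optional

Itemset = FrozenSet[str]


def pick_victim(transaction: Itemset,
                sensitive: Itemset,
                non_sensitive: List[Itemset]) -> Optional[str]:
    """Choose the item of `sensitive` to delete from `transaction`.

    Returns None if the transaction does not contain the sensitive itemset.
    """
    if not sensitive <= transaction:
        return None
    # Only non-sensitive itemsets supported by this transaction can be
    # harmed by deleting an item from it.
    supported = [s for s in non_sensitive if s <= transaction]

    def cost(item: str) -> int:
        # Number of supported non-sensitive itemsets that contain the item.
        return sum(item in s for s in supported)

    # A cost of 0 means both conditions hold: the item is in the sensitive
    # itemset and in no affected non-sensitive itemset. If no such item
    # exists, fall back to the lowest cost (assumption of this sketch).
    # Sorting makes tie-breaking deterministic.
    return min(sorted(sensitive), key=cost)


def sanitize(transactions: List[Itemset],
             sensitive: Itemset,
             non_sensitive: List[Itemset]) -> List[Itemset]:
    """Remove one victim-item from each transaction that contains `sensitive`."""
    cleaned = []
    for t in transactions:
        victim = pick_victim(t, sensitive, non_sensitive)
        cleaned.append(t - {victim} if victim is not None else t)
    return cleaned


transactions = [frozenset("abc"), frozenset("abd"), frozenset("cd")]
sensitive = frozenset("ab")
non_sensitive = [frozenset("ac"), frozenset("bd")]
cleaned = sanitize(transactions, sensitive, non_sensitive)
```

In this toy data the first transaction loses `b` (keeping `ac` intact) and the second loses `a` (keeping `bd` intact), so the same sensitive itemset is hidden with different victim-items in different transactions.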
Keywords/Search Tags: Privacy preserving, Data mining, Association rules, Sensitive patterns