
Research On Algorithms For Privacy Preserving Based On Association Rule Mining

Posted on: 2008-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhong
GTID: 2178360212990335
Subject: Computer applications
Abstract/Summary:
Data mining is the process of extracting, or "mining", knowledge from large-scale data. Most traditional association rule mining algorithms operate on a single local database. In recent years, with the development of computer networking, the data used for association rule mining often come from different users, and distributed association rule mining has gradually become a subject of research. Current distributed association rule mining needs a center that collects the data and then runs a suitable mining algorithm. Users who are concerned about privacy may refuse to provide the relevant data or may supply false data, which affects the efficiency of mining. Hence, mining data while protecting privacy from being leaked has gradually become one of the most important directions in the application of data mining.

We first survey the research background of association rule mining and then discuss algorithms for privacy-preserving association rule mining. We focus mainly on the relationship between the randomized response technique and association rules, analyze the transformation probability that affects the accuracy of a privacy-preserving data mining algorithm, and give an expression relating the transformation probability to the accuracy of the algorithm. We also show that, when the data set contains 10000 transactions and at least 10% of the transactions are selected, the relative error of this expression, i.e., the error produced by the algorithm compared with the expected error, is no more than 6%. Computations demonstrate that the relative error of the expression decreases gradually as the size of the data set increases. Hence, this algorithm can be used in practice.

We then introduce a data-disguising method for privacy-preserving association rule mining based on randomized response techniques. A mining algorithm on the disguised item set is presented, and its security and complexity are analyzed. Experiments show that the rules produced by the algorithm have a relative error of less than 5% compared with the original rules.
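
The abstract does not reproduce the thesis's expression or its algorithm, so the following is only a minimal sketch of the general idea, assuming the classic single-item (Warner-style) randomized response scheme: each transaction reports its true bit for an item with probability p and the flipped bit otherwise, and the miner corrects the observed frequency to estimate the original support. The function names, p = 0.8, and the true support of 0.3 are illustrative assumptions, not values taken from the thesis.

```python
import random


def disguise(bits, p, rng):
    """Keep each 0/1 bit with probability p, flip it with probability 1 - p."""
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_support(disguised, p):
    """Estimate the original fraction of 1s from the disguised bits.

    The observed fraction satisfies
        observed = p * support + (1 - p) * (1 - support),
    which is solved for support below. p = 0.5 carries no information.
    """
    if abs(2 * p - 1) < 1e-12:
        raise ValueError("p = 0.5 makes the original support unrecoverable")
    observed = sum(disguised) / len(disguised)
    return (observed - (1 - p)) / (2 * p - 1)


# Hypothetical data: 10000 transactions, 30% of which contain the item.
rng = random.Random(0)
n = 10000
true_support = 0.3
bits = [1 if i < int(n * true_support) else 0 for i in range(n)]

p = 0.8  # assumed transformation probability, for illustration only
disguised = disguise(bits, p, rng)
estimate = estimate_support(disguised, p)
relative_error = abs(estimate - true_support) / true_support
print(f"estimated support = {estimate:.4f}, relative error = {relative_error:.2%}")
```

In this sketch, moving p closer to 0.5 gives stronger disguise but a noisier estimate, and a larger data set reduces the noise, which is the kind of trade-off between transformation probability and accuracy that the abstract describes.
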
Keywords/Search Tags:association rule, data mining, randomized response, privacy preserving