Optimizing privacy-accuracy tradeoff for privacy preserving data mining

Posted on: 2011-08-03
Degree: M.S.
Type: Thesis
University: University of Maryland, Baltimore County
Candidate: Kim, Dongjin
GTID: 2448390002455784
Subject: Information Technology
Abstract/Summary:
Organizations and companies often need to share data for data mining. However, privacy breaches of the shared data are a concern and may have legal and strategic consequences for the data owners. There is a rich literature on privacy preserving data mining methods that protect privacy while still allowing accurate data mining results. Many such methods have parameters that must be set correctly to achieve the desired degree of privacy protection and the desired quality of mining results. So far, there has been little research on how to find the optimal setting of these parameters efficiently.

This thesis studies the parameter tuning problem for the condensation approach, a widely known privacy preserving data mining method. The contributions include: (1) an analysis of the condensation approach and identification of its weakness, (2) a class-wise condensation approach that addresses this weakness, (3) a rule-based approach for finding the optimal setting of the group size parameter of the condensation approach, and (4) an experimental evaluation of the proposed approaches. The experimental results show that the proposed class-wise condensation approach leads to better mining quality than the original condensation approach, and that the rule-based parameter setting approach can find the optimal setting in a cost-effective manner.
Keywords/Search Tags:Data mining, Privacy, Condensation approach, Find the optimal setting
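
For readers unfamiliar with the method, the general idea behind condensation can be sketched as follows: records are grouped into groups of about k records (the group size parameter mentioned in the abstract), and each group is replaced by synthetic records that preserve the group's mean and covariance. This is a simplified illustration, not the thesis's implementation. The function names `condense` and `condense_classwise` are hypothetical, the Gaussian sampling step is a stand-in chosen for brevity, and the abstract does not describe how the class-wise variant works, so running condensation separately per class label is only one plausible reading.

```python
import numpy as np


def condense(X, k, rng):
    """Simplified condensation sketch (assumed details, not the thesis's code).

    Repeatedly pick a random seed record, group it with its nearest
    neighbours so that the group has k records, and replace the group with
    synthetic records drawn from a Gaussian that has the group's mean and
    covariance. Leftover records (fewer than 2k) form the final group.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    remaining = np.arange(len(X))
    synthetic = []
    while len(remaining) > 0:
        if len(remaining) < 2 * k:
            members = remaining  # fold the leftovers into one last group
        else:
            seed = X[rng.choice(remaining)]
            dist = np.linalg.norm(X[remaining] - seed, axis=1)
            members = remaining[np.argsort(dist)[:k]]
        group = X[members]
        mean = group.mean(axis=0)
        if len(group) > 1:
            cov = np.atleast_2d(np.cov(group, rowvar=False))
        else:
            cov = np.zeros((n_features, n_features))
        synthetic.append(rng.multivariate_normal(mean, cov, size=len(group)))
        remaining = np.setdiff1d(remaining, members)
    return np.vstack(synthetic)


def condense_classwise(X, y, k, rng):
    """Assumed reading of 'class-wise': condense each class label separately."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    parts, labels = [], []
    for c in np.unique(y):
        synth = condense(X[y == c], k, rng)
        parts.append(synth)
        labels.append(np.full(len(synth), c))
    return np.vstack(parts), np.concatenate(labels)
```

In this sketch a larger k mixes more records into each group, which tends to increase privacy but blurs local structure, while a smaller k does the opposite. That is the privacy-accuracy tradeoff the group size parameter controls.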