
Research On Computing Skyline Over Large Scale De-identification Policies

Posted on: 2018-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2428330569975185
Subject: Computer technology
Abstract/Summary:
In the age of big data, it is important to exchange and share data among different parties. De-identification policies protect privacy by publishing an abstracted description of the data. However, the number of possible de-identification policies is exponentially large because of the broad domains of the attributes, so reducing the number of policies is difficult. Skyline computation gives better control over the trade-off between data utility and data privacy: it filters out a set of interesting policies from a potentially large set of policies. A policy is interesting if it is not dominated by any other policy; that is, the policies filtered out are better than the retained ones in neither data utility nor data privacy. Efficient skyline processing over a large number of policies nevertheless remains challenging. Skyline computing over the universal policy set (SKY-FILTER-MR) provides an effective, scalable, and high-precision method for skyline processing over large-scale policies. First, applying the MapReduce programming model to traditional skyline computation over policies greatly reduces the execution time, so skyline queries over large-scale policies can be answered effectively. Second, the approximate skyline adds a tunable approximation parameter to the skyline definition. It requires only that the policies filtered out are not better than the retained ones, in either data utility or data privacy, by more than a certain range. With the approximate skyline, the filtering power is greatly strengthened, which effectively decreases the cost of skyline computation over alternative policies. The parameter can also be varied to trade the near-optimality guarantee against lower risk and higher data utility. Extensive experiments demonstrate that SKY-FILTER-MR runs up to four times faster than the baseline approach and reduces the number of alternative policies by a factor of up to 732 in the best case. It also scales well over large policy sets, and its running time decreases as the approximation parameter increases. SKY-FILTER-MR thus reduces the number of alternative policies and improves efficiency while preserving the accuracy of skyline computing.
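The ideas in the abstract (dominance, the approximate skyline, and a partition-then-merge evaluation in the spirit of MapReduce) can be illustrated with a minimal Python sketch. It is not the thesis's SKY-FILTER-MR implementation. It assumes each policy is scored as a hypothetical pair `(risk, utility_loss)` where lower is better on both axes, that the tunable parameter is an additive tolerance called `eps` here, and that the local (map-like) phase runs exactly while the tolerance is applied only in the final (reduce-like) phase. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: (risk, utility_loss); lower is better on both axes.
Policy = Tuple[float, float]


def eps_dominates(p: Policy, q: Policy, eps: float = 0.0) -> bool:
    """True if p is no worse than q on both axes, allowing a slack of eps."""
    return p[0] <= q[0] + eps and p[1] <= q[1] + eps


def skyline(policies: Sequence[Policy], eps: float = 0.0) -> List[Policy]:
    """Greedy skyline: scan in sorted order, keep a policy only if no
    already-kept policy eps-dominates it. With eps == 0 this yields the
    exact skyline (duplicates collapsed); with eps > 0 every dropped
    policy is within eps of some kept policy on both axes."""
    kept: List[Policy] = []
    for cand in sorted(policies):
        if not any(eps_dominates(k, cand, eps) for k in kept):
            kept.append(cand)
    return kept


def skyline_partitioned(
    policies: Sequence[Policy], chunk_size: int, eps: float = 0.0
) -> List[Policy]:
    """Partition-then-merge evaluation, sequentially simulating the
    map and reduce phases (an assumption, not the thesis's job layout)."""
    chunks = [
        policies[i : i + chunk_size] for i in range(0, len(policies), chunk_size)
    ]
    # "Map": exact local skyline per chunk, so the tolerance is not applied twice.
    local = [skyline(chunk) for chunk in chunks]
    # "Reduce": merge local skylines and filter once more, now with eps.
    merged = [p for part in local for p in part]
    return skyline(merged, eps)


# Made-up scores, for illustration only.
POLICIES: List[Policy] = [
    (0.1, 0.9), (0.2, 0.5), (0.25, 0.48), (0.5, 0.2),
    (0.6, 0.6), (0.9, 0.1), (0.3, 0.7),
]

exact = skyline(POLICIES)
approx = skyline(POLICIES, eps=0.05)
print(len(POLICIES), len(exact), len(approx))
```

On this toy input a larger `eps` leaves fewer candidate policies, which matches the abstract's observation that filtering becomes stronger as the parameter grows. The sketch does not reproduce the reported running times or reduction factors.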
Keywords/Search Tags:de-identification policy, skyline, data privacy, MapReduce