
Research On Sensitivity-adaptive Privacy Model For Set-valued Data Release

Posted on: 2017-01-29    Degree: Master    Type: Thesis
Country: China    Candidate: L H Chen    Full Text: PDF
GTID: 2308330488475454    Subject: Computer software and theory
Abstract/Summary:
In recent years, the rapid development of computer technology and of Internet-based applications has produced large-scale set-valued data, such as shopping records, web search and query logs, and electronic health records. Publishing set-valued data is of great value, since it supports services such as behavior prediction, commodity recommendation, and information retrieval. However, such data may contain private individual information. If data owners release it directly, without masking, then any attacker who obtains the dataset can expose users' private information and put some users at risk. Merely deleting identity information does not protect privacy, because sensitive information can still be inferred from the complex correlations among records. Data privacy protection touches many research fields, such as data analysis, information security, and uncertainty methods. It is therefore necessary to mask the original data before publication, and privacy protection for set-valued data has become a research hot spot.

The earliest methods for protecting set-valued data are the k-anonymity model and its extensions. The p-uncertainty model, studied later, better fits the characteristics of set-valued data in terms of both privacy and utility: it requires that the probability with which any attacker can infer one of a user's sensitive items is no greater than a single threshold p. However, the model places all sensitive items at the same sensitivity level. Because the distribution of sensitive items is uneven and items differ in how sensitive they are, ignoring these characteristics leads to over-suppression of highly sensitive items and over-generalization of weakly sensitive ones, which reduces the utility of the anonymized data. Using one privacy threshold for the whole anonymization process is clearly unreasonable. The main research work is as follows.

First, a sensitivity-adaptive p-uncertainty model is proposed that prevents over-generalization and over-suppression by using adaptive privacy thresholds. The thresholds are derived from the uneven distribution of the sensitive items, so they capture the hidden privacy features of the set-valued dataset and the personalized needs of its users.

Second, a fine-grained privacy-preserving technique based on local generalization and partial suppression is used to balance privacy protection against data utility. In each top-down recursion, all items are specialized one level in the hierarchy tree and the records are assigned to sub-groups. If a sub-group does not satisfy the sensitivity-adaptive p-uncertainty model, partial suppression is applied to mask that group. Because the method combines the two algorithms throughout the anonymization process, the information loss of the anonymized dataset is minimized. Illustrative sketches of both steps are given below.
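To make the adaptive-threshold idea concrete, the following Python sketch derives a per-item threshold from each sensitive item's frequency in the dataset. The function name, the base_p parameter, and the scaling rule (rarer items get stricter thresholds, capped at base_p) are illustrative assumptions; the abstract does not give the thesis's exact threshold formula.

from collections import Counter

def adaptive_thresholds(records, sensitive_items, base_p=0.5):
    """Assign each sensitive item its own privacy threshold.

    Sketch only: the rule (threshold proportional to the item's support,
    capped at base_p) is an assumption, not the thesis's formula.
    """
    n = len(records)
    support = Counter(item for rec in records
                      for item in rec if item in sensitive_items)
    thresholds = {}
    for item in sensitive_items:
        freq = support.get(item, 0) / n if n else 0.0
        # Rarer items are treated as more sensitive and get a stricter
        # (lower) threshold; base_p caps the threshold for common items.
        thresholds[item] = min(base_p, max(freq, 0.05))
    return thresholds

# Example: four transactions with hypothetical sensitive items.
records = [{'bread', 'HIV'}, {'milk', 'flu'}, {'bread', 'flu'}, {'milk'}]
print(adaptive_thresholds(records, {'HIV', 'flu'}))
# The rare item 'HIV' gets threshold 0.25; the common 'flu' gets 0.5.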
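The top-down anonymization loop can be sketched as follows. This is a minimal illustration, not the thesis's algorithm: the hierarchy encoding, the simplified p-uncertainty check (which here bounds only the plain frequency of each sensitive item within a group, not full conditional inference probabilities), and the greedy partial-suppression rule are all assumptions.

def satisfies_thresholds(group, thresholds):
    # Simplified stand-in for the p-uncertainty check (an assumption):
    # each sensitive item s may occur in at most a thresholds[s] fraction
    # of the group's records.
    n = len(group)
    return all(sum(s in rec for rec in group) / n <= p
               for s, p in thresholds.items())

def partially_suppress(group, thresholds):
    # Partial suppression: delete an over-represented sensitive item from
    # just enough records, rather than masking the whole group.
    group = [set(rec) for rec in group]
    n = len(group)
    for s, p in thresholds.items():
        holders = [rec for rec in group if s in rec]
        while len(holders) / n > p:
            holders.pop().discard(s)
    return group

def anonymize(group, level, hierarchy, thresholds):
    # Top-down recursion: specialize items one hierarchy level at a time
    # (local generalization), split the records into sub-groups, and fall
    # back to partial suppression for sub-groups that violate the model.
    if level >= len(hierarchy):          # items are fully specialized
        return [group]
    subgroups = {}
    for rec in group:
        key = frozenset(hierarchy[level].get(i, i) for i in rec)
        subgroups.setdefault(key, []).append(rec)
    result = []
    for sub in subgroups.values():
        if satisfies_thresholds(sub, thresholds):
            result.extend(anonymize(sub, level + 1, hierarchy, thresholds))
        else:
            result.append(partially_suppress(sub, thresholds))
    return result

# Demo with a one-level hierarchy over the non-sensitive items.
hierarchy = [{'bread': 'food', 'milk': 'food'}]
thresholds = {'HIV': 0.25, 'flu': 0.5}
data = [{'bread', 'HIV'}, {'milk', 'flu'}, {'bread', 'flu'}, {'milk'}]
print(anonymize(data, 0, hierarchy, thresholds))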
Finally, the proposed approach is evaluated in experiments on real datasets; the evaluation measures include information loss, the utility of the anonymized data, and the scalability of the algorithm. All experiments show that the method effectively improves the utility of the anonymized data while protecting users' private information, and thereby raises the study value of the published dataset.
Keywords/Search Tags: set-valued data, data publication, privacy protection, generalization and suppression