Font Size: a A A

Study On Privacy Protection Algorithm Based On K-Anonymity

Posted on:2011-09-13Degree:MasterType:Thesis
Country:ChinaCandidate:H L ChenFull Text:PDF
GTID:2178330338481768Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
Today, with the rapid development of the information age, more and more data are provided for people to share and use, at the same time more and more attention has been paid for the privacy protection of the released data. So when realizing data, we should both protect the privacy of the data and ensure data integrity as much as possible, then the data can be used for people. A method called K-anonymity was proposed. K-anonymity is a technique that prevents joining attacks by generalization or suppressing portions of the source data so that no individual can be uniquely distinguished from a group of size k. K-anonymity is simple and is easily to be implemented; it also can reduce the information loss at the maximum extent at the same time protecting personal privacy.Of all the K-anonymity algorithms, there is an efficient algorithm called optimal lattice anonymization(referred to OLA) algorithm. It uses a structure called lattice. We get the optimal result through traversing the nodes of the lattice. But the traversal order of lattice nodes of the OLA algorithm cannot reduce the lattice nodes at the most that need to be computed, which increase the run time of algorithm, and because of the existence of the isolated data in the source data,we can't get a good result on the information loss through processing the source data only once using the global K-anonymity algorithm. In this paper we improve the OLA algorithm on these two aspects, and these improvements have better results. Of the next three points, the first two are directed to reducing the run time of the algorithm and the last one is directed to reducing the information loss of the processed data:1. To the traversal sequence of the lattice nodes, we proposed a method which the traversal sequence is due to the product of degree of the node. Experiments show that it can reduce the lattice nodes which should be computed, which decrease the run time of the algorithm.2. To the number of the lattice nodes to be computed, we use the subset nature of k-anonymity, we can mark (remove) more nodes of the lattice after computing a non-k-anonymous node, which can reduce the number of nodes to be computed.3. Although the result of the OLA algorithm is optimal, but it is not satisfactory on the information loss because of the existence of isolated data, the method we use to deal with this situation is to dividing the data into two parts, these two parts are called isolated data and non-isolated data. Next we run the K-anonymity algorithm on the non-isolated data again and next add the information loss of the two parts; we call this method second K-anonymity. Experiments show that the information loss can be greatly reduced.
Keywords/Search Tags:K-anonymity, Degree First, Subset, Second Anonymity, privacy protection
PDF Full Text Request
Related items