
Research On The Problem Of Finding Optimal Value On Quasi-Identifier For K-Anonymity Privacy Protection Model

Posted on: 2011-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: D L Wang
Full Text: PDF
GTID: 2178330338991190
Subject: Computer application technology
Abstract/Summary:
K-anonymity is an important method for preventing the disclosure of private data in view publishing. The value on the quasi-identifier is a key factor affecting both the degree of privacy protection and the data quality of k-anonymous tables. Once the generalization trees of the quasi-identifier attributes have been generated, finding the optimal value on the quasi-identifier is essential if the anonymous table is to meet its privacy protection requirements while retaining high utility. This thesis analyzes the current state of the view security problem in detail, and presents and studies in depth the problem of finding the optimal value on the quasi-identifier in the k-anonymity privacy protection model.

Firstly, the optimal value on the quasi-identifier is defined. With the generalization trees of the quasi-identifier attributes already generated, the problem of finding the optimal value on the quasi-identifier is posed with the aim of increasing the utility of the anonymous table, and this problem is proved to be NP-complete.

Secondly, building on this problem, a greedy algorithm for finding the optimal value on the quasi-identifier is presented. The algorithm takes the data quality of the anonymous table into account, so it both protects the anonymous table against privacy leakage and finds an approximately optimal value on the quasi-identifier.

Moreover, using an information loss metric chosen for its practicality, a method for finding the optimal value on the quasi-identifier based on the chain enumeration tree is presented. This method finds the best chain by constructing the chain enumeration tree dynamically, and several techniques for reducing the search space are used to improve search efficiency. The best chain is then used to derive the optimal value on the quasi-identifier, which yields the chain-enumeration-tree-based algorithm for finding the optimal value on the quasi-identifier.

Finally, all of the above algorithms are validated by experiments. The experimental results are presented and analyzed, and they demonstrate the feasibility and validity of the algorithms.
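To make the greedy idea concrete, the following is a minimal Python sketch, not the thesis's algorithm. It assumes that the "value on the quasi-identifier" means the generalization level chosen for each quasi-identifier attribute. The hierarchies `generalize_age` and `generalize_zip`, the sample rows, and the selection rule (raise the level that leaves the fewest records in groups smaller than k) are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical generalization hierarchies: level 0 is the raw value,
# higher levels are coarser.
def generalize_age(age, level):
    if level == 0:
        return str(age)
    if level == 1:
        low = age // 10 * 10
        return f"{low}-{low + 9}"
    return "*"

def generalize_zip(zipcode, level):
    # Mask the last `level` digits of the code.
    level = min(level, len(zipcode))
    return zipcode[:len(zipcode) - level] + "*" * level

# One (function, maximum level) pair per quasi-identifier attribute.
HIERARCHIES = [(generalize_age, 2), (generalize_zip, 5)]

def apply_levels(rows, levels):
    """Generalize every row to the given per-attribute levels."""
    return [
        tuple(func(value, level)
              for (func, _), value, level in zip(HIERARCHIES, row, levels))
        for row in rows
    ]

def violations(rows, levels, k):
    """Count records that fall in groups with fewer than k members."""
    counts = Counter(apply_levels(rows, levels))
    return sum(c for c in counts.values() if c < k)

def is_k_anonymous(rows, levels, k):
    return violations(rows, levels, k) == 0

def greedy_levels(rows, k):
    """Raise one level at a time until the table is k-anonymous.

    Returns the chosen levels, or None if even full generalization fails.
    The result is a heuristic answer and is not guaranteed to be optimal.
    """
    levels = [0] * len(HIERARCHIES)
    while not is_k_anonymous(rows, levels, k):
        best, best_score = None, None
        for i, (_, max_level) in enumerate(HIERARCHIES):
            if levels[i] == max_level:
                continue
            trial = levels.copy()
            trial[i] += 1
            score = violations(rows, trial, k)
            if best is None or score < best_score:
                best, best_score = trial, score
        if best is None:
            return None
        levels = best
    return levels

rows = [(23, "13053"), (27, "13058"), (35, "13068"), (38, "13061")]
print(greedy_levels(rows, 2))
```

Because each step is chosen locally, a search of this kind can settle on levels that are coarser than necessary, which is consistent with the abstract describing the greedy result as approximate.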
Keywords/Search Tags:k-anonymity, data quality, generalization tree, optimal value on quasi-identifier, NP-complete, chain enumeration tree