
Research On Key Technologies Of Privacy Protection In Data Publishing

Posted on: 2011-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Huang
Full Text: PDF
GTID: 2178330338476306
Subject: Computer application technology
Abstract/Summary:
With the development of electronics and network technology, personal information is being collected and shared extensively, and privacy disclosure has become an important security problem. In data publishing, data containing private information is released to the public and can be accessed by anybody. A key issue is therefore how to protect the private information in published data from being obtained by malicious attackers while preserving enough data utility for recipients to carry out their research effectively. This thesis focuses on privacy protection techniques for data publishing, in which generalization is a main method. The main contributions are as follows:
(1) Introduce the existing anonymization policies and algorithms, analyze the problems of those methods, and discuss several new questions in the field.
(2) Propose RIncognito, a generalization algorithm based on self-revision of attribute hierarchies, which improves on the k-anonymity algorithm Incognito. The algorithm considers the distribution of attribute values in the metadata and partitions the attribute domain so that values with low frequency are merged, yielding better hierarchies. Reducing unnecessary generalization improves data accuracy. Experiments show higher data accuracy after generalizing data with RIncognito.
(3) Propose new metrics of data privacy and accuracy. The privacy of generalized data still cannot be calculated accurately. We define a metric, Average Probability Rate, that quantifies data privacy. We also analyze the existing metrics of data accuracy and define a new one based on information theory, Weighted Attributes Entropy, which measures how much information a generalized value delivers. Finally, we show the connection between these two metrics through experiments.
(4) Design and implement the Data Publishing Module in NHSecure, a secure DBMS developed by our research team. We introduce the privacy protection mechanism of the k-anonymity policy into the implementation of data generalization and publishing. Finally, we show through examples that the privacy protection mechanism enhances the security of the module.
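The idea behind contribution (2), merging low-frequency attribute values before checking k-anonymity, can be illustrated with a minimal Python sketch. This is not the RIncognito algorithm itself, whose details the abstract does not give. The function names `merge_rare_values` and `is_k_anonymous`, the `min_count` threshold, the `"*"` label, and the toy table are all assumptions made for illustration only.

```python
from collections import Counter


def merge_rare_values(values, min_count, merged_label="*"):
    """Replace each value that occurs fewer than min_count times with merged_label.

    Hypothetical helper: a crude stand-in for merging low-frequency values
    into one node of an attribute hierarchy.
    """
    counts = Counter(values)
    return [v if counts[v] >= min_count else merged_label for v in values]


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(size >= k for size in groups.values())


# Toy table (invented data, not from the thesis).
rows = [
    {"zip": "21000", "job": "teacher", "disease": "flu"},
    {"zip": "21000", "job": "teacher", "disease": "cold"},
    {"zip": "21000", "job": "pilot", "disease": "asthma"},
    {"zip": "21000", "job": "diver", "disease": "flu"},
]

# Merge the rare job values, then rebuild the table with the generalized column.
jobs = merge_rare_values([r["job"] for r in rows], min_count=2)
generalized = [dict(r, job=j) for r, j in zip(rows, jobs)]

before = is_k_anonymous(rows, ["zip", "job"], k=2)
after = is_k_anonymous(generalized, ["zip", "job"], k=2)
```

In this toy case only the two rare job values are replaced, while the frequent value "teacher" stays exact. That is the sense in which merging by frequency may avoid generalizing the whole column more than necessary.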
Keywords/Search Tags: secure database, data publishing, privacy protection, privacy metric, accuracy metric, generalization algorithm