Research Of Privacy Protection Technology For Multi-sensitive Attributes Based On Dynamic Datasets

Posted on: 2012-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: L F Zhang
Full Text: PDF
GTID: 2178330338492288
Subject: Computer application technology
Abstract/Summary:
With the rapid development of society, privacy protection technology for databases has been gaining more and more attention because individual privacy can be disclosed when databases are used. For example, data such as an organization's population statistics or a hospital's patient records have great research value, but because these data often contain personal information, publishing and sharing them can disclose individual privacy. Real-world datasets change constantly. If a dynamic dataset is republished directly in the way a static dataset would be, a great deal of private information can leak. The republication of dynamic datasets therefore faces additional challenges.

An improved Bucket algorithm is proposed for privacy protection of multi-sensitive attributes in dynamic datasets. The algorithm handles the insertion and deletion of records in dynamic relational datasets. Its core ideas are as follows. First, two concepts are introduced, the candidate update set (CUS) and the pseudo tuple set (PTS), and a corresponding model is designed for each. The CUS ensures that the sensitive attributes of the original data remain complete across repeated releases. The tuples in the PTS do not actually exist; they are introduced only to satisfy the privacy protection requirements of the raw data. Second, "m-invariance" and the "multi-dimensional bucket structure" are inherited, and the improved Bucket algorithm clusters and generalizes the raw data and checks whether privacy leaks between the repeatedly released anonymized tables. If a leak is found, a similar record in the CUS is located and inserted where the leak occurs; if no similar record exists, a record from the PTS is inserted instead and the number of PTS records used is recorded. In this way, when dynamic datasets are released repeatedly, the requirements of dataset updates are met and the privacy of the dynamic dataset is protected.

Taking the medical data of a hospital as an example, this study investigates multi-sensitive attributes on the basis of previous researchers' work. The privacy breaches that arise when multi-sensitive attributes in existing dynamic datasets are republished are discussed in detail, and an improved Bucket algorithm is proposed. Experimental results show that the algorithm can protect privacy in relational databases, providing a higher degree of privacy protection together with low memory usage.
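
The repair step described in the abstract can be illustrated with a small Python sketch. It is not the thesis's implementation: the abstract gives no data structures, so everything here is an assumption made for illustration. A group is modeled as a dict from record id to sensitive value, the m-uniqueness test follows the usual m-invariance definition (at least m records, all with distinct sensitive values), and the names `signature`, `is_m_unique`, and `repair_group` are hypothetical. Multiple sensitive attributes could be modeled by using a tuple as the sensitive value.

```python
def signature(group):
    """Set of sensitive values present in a group (dict: record id -> sensitive value)."""
    return frozenset(group.values())


def is_m_unique(group, m):
    """True if the group has at least m records, all with distinct sensitive values."""
    return len(group) >= m and len(signature(group)) == len(group)


def repair_group(group, required_signature, cus):
    """Restore a group's signature after updates (illustrative assumption only).

    Each missing sensitive value is filled with a matching record from the
    candidate update set `cus` (dict: record id -> sensitive value, consumed
    in place). If no match exists, a pseudo tuple is added and counted.
    Returns (repaired group, number of pseudo tuples added).
    """
    repaired = dict(group)
    pseudo_count = 0
    for value in sorted(required_signature - signature(group)):
        match = next((rid for rid, v in cus.items() if v == value), None)
        if match is not None:
            repaired[match] = cus.pop(match)
        else:
            pseudo_count += 1
            repaired[f"pseudo-{pseudo_count}"] = value
    return repaired, pseudo_count


# Hypothetical example: r2 and r3 were deleted between two releases.
previous = {"r1": "flu", "r2": "asthma", "r3": "diabetes"}
current = {"r1": "flu"}
cus = {"r7": "asthma"}  # records waiting to be published

repaired, pseudo = repair_group(current, signature(previous), cus)
print(repaired, pseudo)
```

In this example the missing value "asthma" is filled by the real record r7 from the CUS, while "diabetes" has no candidate and is filled by one pseudo tuple, so the repaired group carries the same signature as in the previous release.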
Keywords/Search Tags: privacy protection, dynamic datasets, multi-sensitive attributes, anonymization, redistribution