A Semantic L-diversity Privacy Protection Algorithm Based On Clustering

Posted on: 2015-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: X Han
Full Text: PDF
GTID: 2348330518970405
Subject: Computer application technology
Abstract/Summary:
In recent years, the widespread adoption of the Internet and of data storage technology has promoted the development of intelligent transportation and location-based service recommendation. However, the raw data these services collect often contain sensitive information, and releasing the data without any processing would reveal individuals' private information. This has prompted research on privacy in data publishing, whose objective is to balance data security against data availability: on the one hand, security is ensured by giving up some of the information in the data; on the other hand, the important information in the original data is retained so that it can still be used for analysis.

By studying and analyzing the l-diversity model, we found that it has the advantage of guaranteeing diverse sensitive attribute values in each group. However, l-diversity models do not consider the semantic information of sensitive attributes. We therefore propose a semantic l-diversity privacy protection algorithm based on clustering. First, we add the semantic information of the sensitive attribute to the l-diversity model, so that each equivalence class contains at least l semantically dissimilar sensitive attribute values. Second, we use clustering to form the equivalence classes: subject to the semantic l-diversity condition, the algorithm selects the most similar records in the data set and puts them into the same cluster. To reduce information loss and improve the quality of the anonymized table, we adjust the clusters at the end of the algorithm by decomposing clusters and merging their records into the nearest clusters. Finally, we generalize the quasi-identifier attributes of the records in each cluster, which anonymizes the table.

We then evaluated the algorithm experimentally. The results verify that the algorithm presented in this thesis effectively prevents similarity attacks and incurs low information loss, preserving the availability of the data as far as possible.
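
The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify the semantic similarity measure, the record distance, or the exact adjustment rule, so everything below is an assumption made for illustration. In particular, the record schema (age, zip prefix, disease), the `CATEGORY` mapping, the rule that two sensitive values are semantically dissimilar when they fall in different categories, the Manhattan `distance`, and the greedy seed-based clustering are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical record schema: (age, zip_prefix, disease).
# The first two fields are quasi-identifiers; the last is the sensitive attribute.
Record = Tuple[int, int, str]

# Assumed semantic hierarchy: sensitive value -> semantic category.
CATEGORY: Dict[str, str] = {
    "flu": "respiratory",
    "pneumonia": "respiratory",
    "bronchitis": "respiratory",
    "gastritis": "digestive",
    "ulcer": "digestive",
}


def is_semantically_l_diverse(group: List[Record], l: int) -> bool:
    """True if the group holds at least l distinct semantic categories."""
    return len({CATEGORY[r[2]] for r in group}) >= l


def distance(a: Record, b: Record) -> int:
    """Assumed Manhattan distance over the quasi-identifiers."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def cluster(records: List[Record], l: int) -> List[List[Record]]:
    """Greedy sketch: grow each cluster from a seed, then adjust leftovers."""
    remaining = list(records)
    clusters: List[List[Record]] = []
    while remaining:
        seed = remaining.pop(0)
        group = [seed]
        while not is_semantically_l_diverse(group, l):
            seen = {CATEGORY[r[2]] for r in group}
            # Only records that contribute a new semantic category qualify.
            candidates = [r for r in remaining if CATEGORY[r[2]] not in seen]
            if not candidates:
                break
            nearest = min(candidates, key=lambda r: distance(seed, r))
            remaining.remove(nearest)
            group.append(nearest)
        clusters.append(group)

    # Adjustment step (assumed rule): dissolve clusters that miss the
    # constraint and merge each of their records into the nearest valid cluster.
    valid = [g for g in clusters if is_semantically_l_diverse(g, l)]
    leftovers = [r for g in clusters
                 if not is_semantically_l_diverse(g, l) for r in g]
    if not valid:
        raise ValueError("no cluster satisfies semantic l-diversity")
    for r in leftovers:
        min(valid, key=lambda g: distance(g[0], r)).append(r)
    return valid


def generalize(group: List[Record]) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Replace quasi-identifiers by the (min, max) ranges covering the group."""
    ages = [r[0] for r in group]
    zips = [r[1] for r in group]
    return (min(ages), max(ages)), (min(zips), max(zips))


records: List[Record] = [
    (25, 130, "flu"),
    (26, 130, "gastritis"),
    (52, 148, "ulcer"),
    (53, 148, "bronchitis"),
    (27, 131, "pneumonia"),
]
clusters = cluster(records, l=2)
for group in clusters:
    print(generalize(group), sorted(r[2] for r in group))
```

Under these assumptions, a group containing only "flu" and "pneumonia" has two distinct sensitive values and so passes a plain distinct-value l-diversity check with l = 2, yet it fails the semantic check because both values belong to the same category. This is the kind of similarity attack the semantic constraint is meant to rule out.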
Keywords/Search Tags: Privacy protection, l-diversity, Semantic information, Clustering, Generalization