Research On K-anonymous Algorithm Based On The Sensitive Degree Of Privacy

Posted on: 2016-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: F J Yang
GTID: 2308330482481224
Subject: Management Science and Engineering
Abstract/Summary:
The emergence of cloud computing signals the arrival of the big data era: the future is not only an information age but also a big data age. The sheer volume of available data allows a large amount of latent information to be mined, and this information has become an asset of great value, bringing benefits to enterprises and to the nation. At the same time, a huge amount of personal private data is exposed along with it, causing sensitive data to leak. Privacy protection in the big data era cannot be ignored, and preventing the leakage of sensitive data has become an urgent problem.
When a data table is released or shared, privacy protection technology must consider two aspects: (1) the privacy of the released data must not be leaked; (2) after the anonymized data is released, it must still be possible to mine it efficiently and with practical value. The challenge for privacy-preserving anonymization is therefore to avoid privacy disclosure while keeping the data authentic and efficiently usable.
The basic idea of K-anonymous privacy protection is to apply a series of anonymization operations that generalize the original data set into an anonymous data set, so that sensitive data cannot be discovered even after the data is published. Because the K-anonymity model is vulnerable to background knowledge attacks, linking attacks, homogeneity attacks, and similarity attacks, this thesis proposes a new model, built on K-anonymity, that is based on the privacy protection degree of sensitive attributes.
To minimize information loss, the algorithm first clusters the records using minimal information loss as the clustering principle.
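The K-anonymity condition described above can be illustrated with a minimal sketch. The records, the attribute names (`age`, `zip`, `disease`), and the generalization rules (10-year age bands, truncated zip codes) are assumptions chosen for illustration and do not come from the thesis.

```python
from collections import Counter

def generalize_age(age, width=10):
    """Replace an exact age with a band such as '20-29' (illustrative rule)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def equivalence_classes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def is_k_anonymous(records, quasi_identifiers, k):
    """A table is k-anonymous if every equivalence class has at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(size >= k for size in classes.values())

# Hypothetical raw table: every quasi-identifier combination is unique.
raw = [
    {"age": 23, "zip": "31001", "disease": "flu"},
    {"age": 27, "zip": "31002", "disease": "HIV"},
    {"age": 34, "zip": "31011", "disease": "flu"},
    {"age": 38, "zip": "31015", "disease": "cancer"},
]

# Generalize the quasi-identifiers; the sensitive attribute is left untouched.
anonymized = [
    {"age": generalize_age(r["age"]),
     "zip": r["zip"][:3] + "**",
     "disease": r["disease"]}
    for r in raw
]

print(is_k_anonymous(raw, ["age", "zip"], 2))
print(is_k_anonymous(anonymized, ["age", "zip"], 2))
```

Coarser generalization makes larger equivalence classes but discards more detail, which is the information-loss trade-off the abstract refers to.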
Because the quasi-identifiers influence the sensitive attribute, clustering can produce equivalence classes that contain only a single sensitive attribute value. An attacker can then infer a user's sensitive data with very high confidence, which leaks private data. On this basis, the thesis proposes a second clustering step based on the privacy protection degree. First, a privacy protection degree is defined for the sensitive attribute, and the standard deviation and distance of the privacy protection degrees within each equivalence class are computed. The smaller the standard deviation of the privacy protection degrees in an equivalence class, the closer together those degrees are. While keeping information loss as low as possible, tuples are drawn from the nearest equivalence class and generalized into the same equivalence class according to the privacy protection degree of their sensitive attribute. This ensures the diversity of sensitive attribute values within each equivalence class, resists homogeneity and similarity attacks, and better protects private data from leaking.
The algorithm is compared experimentally with the basic K-anonymity algorithm in two respects: running time and information loss. The results show that the running time of this algorithm is lower. Although the model incurs slightly higher information loss, the resulting data is better protected, so the loss is acceptable.
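The role of the standard deviation can be sketched as follows. The abstract does not give the thesis's actual definition of the privacy protection degree, so the numeric degrees in `SENSITIVITY`, the threshold, and the function names here are hypothetical; the sketch only shows how a small standard deviation flags classes exposed to homogeneity or similarity attacks.

```python
from statistics import pstdev

# Hypothetical privacy protection degrees in [0, 1]; higher means more sensitive.
SENSITIVITY = {"flu": 0.1, "bronchitis": 0.2, "cancer": 0.9, "HIV": 1.0}

def degree_std(sensitive_values, degree=SENSITIVITY):
    """Population standard deviation of the degrees within one equivalence class."""
    return pstdev([degree[v] for v in sensitive_values])

def flag_low_diversity(classes, threshold):
    """Return the names of classes whose degree spread falls below the threshold."""
    return [name for name, values in classes.items()
            if degree_std(values) < threshold]

# Sensitive values per equivalence class (illustrative data).
classes = {
    "A": ["HIV", "cancer"],  # different values, similar degree: similarity attack
    "B": ["flu", "HIV"],     # well spread degrees
    "C": ["flu", "flu"],     # a single value: homogeneity attack
}

for name, values in classes.items():
    print(name, round(degree_std(values), 3))

print(flag_low_diversity(classes, threshold=0.1))
```

Under these assumptions, classes A and C would be candidates for merging with a nearby equivalence class, while B already mixes degrees of low and high sensitivity.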
Keywords: privacy protection, K-anonymity, tuple information loss, privacy protection degree of sensitive attributes, clustering