Research On Microdata Anonymity Model Based On Sensitivity Grading And Quantification

Posted on: 2019-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: C W Zhou
GTID: 2428330572455942
Subject: Engineering
Abstract/Summary:
With the development of cloud computing and big data technologies, the demand for open data in scientific research and decision-making grows steadily. Because data sets often contain sensitive information that individuals do not want disclosed, privacy protection has become increasingly prominent and is now a major obstacle to the development of big data. To protect the privacy of individuals in released data sets, computer science researchers have studied data anonymization technology. In general, data anonymization first sets up an anonymity model, then anonymizes the original data set through operations such as generalization and suppression, and finally obtains a result data set that satisfies the anonymity model. The anonymity model constrains the characteristics of the data set, and its main parameters directly or indirectly restrict the privacy disclosure risk.

For privacy protection in the era of big data, this thesis proposes an improved anonymity model, (w,l,k)-anonymity. First, the defects of existing anonymity models are analyzed. For example, k-anonymity places no restriction on the sensitive attributes, so it cannot resist attribute linkage. Neither l-diversity nor t-closeness considers the difference in sensitivity among the values of a single sensitive attribute; both protect values of different sensitivity to the same degree, which indirectly increases the risk of sensitive attribute disclosure and the loss of data utility in the anonymized data sets.

Next, for anonymity models based on sensitivity grading, the deficiencies in the mechanisms of p+-sensitive k-anonymity and (p,α)-sensitive k-anonymity are summarized. p+-sensitive k-anonymity grades the sensitivity of attribute values and ensures that the sensitivity levels of the sensitive attribute values in each equivalence class are diverse, but it cannot prevent highly sensitive attribute values from aggregating in an equivalence class. (p,α)-sensitive k-anonymity quantifies a sensitivity weight on top of sensitivity grading, but its quantification method is not reasonable enough: the model cannot limit the overall sensitivity of an equivalence class, and it does not support quantifying the sensitivity of numeric sensitive attribute values.

An improvement scheme is then proposed for the defects of (p,α)-sensitive k-anonymity, and on this basis an improved anonymity model based on sensitivity grading is presented. To address the unreasonable quantification of the sensitivity weight, the weighting method of the sensitivity grading is improved: a frequency-sensitivity component is introduced and weighted together with the grading-sensitivity to optimize the quantification mechanism. To address the strong influence that the number of tuples in an equivalence class has on the sensitivity weight, the average sensitivity of the equivalence class is calculated and used to limit its overall sensitivity. To address the narrow application scope of sensitivity grading, the frequency-sensitivity of the category or interval to which an attribute value belongs is calculated, which supports quantifying the sensitivity of numeric attributes and of categorical attributes that are not easily divided into sensitivity levels.

Using a real data set, the effectiveness of the improved anonymity model is verified from three perspectives: resistance to identity disclosure risk, resistance to sensitive attribute disclosure risk, and data utility. The experimental results show that the improved anonymity model further reduces the risk of sensitive attribute disclosure in anonymized data sets.
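The constraints discussed in the abstract can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the thesis's implementation. The k-anonymity and distinct l-diversity checks follow their standard definitions. The sensitivity score is hypothetical: the abstract does not give the formula, so the sketch assumes that rarer values are more sensitive (frequency-sensitivity = 1 - relative frequency), that the two components are mixed with a weight `beta`, and that `w` is an upper bound on the average sensitivity of each equivalence class. The toy records, the grade weights, and all function names are invented for the example.

```python
from collections import Counter, defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[q] for q in quasi_identifiers)].append(row)
    return dict(groups)


def is_k_anonymous(groups, k):
    """Every equivalence class contains at least k records."""
    return all(len(g) >= k for g in groups.values())


def is_distinct_l_diverse(groups, sensitive, l):
    """Every equivalence class contains at least l distinct sensitive values."""
    return all(len({r[sensitive] for r in g}) >= l for g in groups.values())


def value_sensitivity(value, grade_weight, value_freq, beta):
    # ASSUMPTION (not from the abstract): rarer values count as more
    # sensitive, and beta mixes grading- and frequency-sensitivity.
    return beta * grade_weight[value] + (1.0 - beta) * (1.0 - value_freq[value])


def average_sensitivity(group, sensitive, grade_weight, value_freq, beta=0.5):
    """Mean per-record sensitivity, so class size does not inflate the score."""
    scores = [value_sensitivity(r[sensitive], grade_weight, value_freq, beta)
              for r in group]
    return sum(scores) / len(scores)


def satisfies_wlk(groups, sensitive, w, l, k, grade_weight, value_freq, beta=0.5):
    # ASSUMPTION: w caps the average sensitivity of every equivalence class.
    return (is_k_anonymous(groups, k)
            and is_distinct_l_diverse(groups, sensitive, l)
            and all(average_sensitivity(g, sensitive, grade_weight,
                                        value_freq, beta) <= w
                    for g in groups.values()))


# Toy, already-generalized records (invented for illustration).
ROWS = [
    {"age": "20-30", "zip": "476**", "disease": "HIV"},
    {"age": "20-30", "zip": "476**", "disease": "Flu"},
    {"age": "20-30", "zip": "476**", "disease": "Cancer"},
    {"age": "30-40", "zip": "479**", "disease": "Flu"},
    {"age": "30-40", "zip": "479**", "disease": "Flu"},
    {"age": "30-40", "zip": "479**", "disease": "Cold"},
]
# Hypothetical grade weights: higher means a more sensitive grade.
GRADE_WEIGHT = {"HIV": 1.0, "Cancer": 1.0, "Flu": 0.25, "Cold": 0.25}

counts = Counter(r["disease"] for r in ROWS)
VALUE_FREQ = {value: n / len(ROWS) for value, n in counts.items()}

GROUPS = equivalence_classes(ROWS, ["age", "zip"])
AVG = {key: average_sensitivity(g, "disease", GRADE_WEIGHT, VALUE_FREQ)
       for key, g in GROUPS.items()}

for key, score in AVG.items():
    print(key, round(score, 3))
```

In this toy data both classes satisfy 3-anonymity and distinct 2-diversity, yet the first class concentrates two high-grade values. A cap on the average sensitivity is one way to flag that kind of aggregation, which diversity of sensitivity levels alone does not prevent.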
Keywords/Search Tags: Privacy Protection, Data Anonymization, Anonymity Model, Sensitivity Rating