
A K_anonymity Algorithm Based On Determining Intervals Of Generalization By Sampling

Posted on: 2014-12-18  Degree: Master  Type: Thesis
Country: China  Candidate: M Q Li  Full Text: PDF
GTID: 2268330425465998  Subject: Computer application technology
Abstract/Summary:
Current k-anonymity algorithms, for convenience, ignore the distribution of quasi-identifier attribute values when designing the generalization structure and simply generalize attribute values upward in a mechanical way. When some attribute values in a dataset occur with high frequency within a small range, this causes many records to aggregate in a few equivalence classes. There is therefore considerable room to improve the information availability of these algorithms.
To address this, this thesis proposes a new k-anonymity algorithm, the DIGS algorithm. Using sampling, the algorithm determines the overall generalization interval of the sen-associated attribute and then determines the generalization structure of the attributes from top to bottom.
Dividing the work by anonymization period, the thesis describes how DIGS is applied in two processes: static anonymity and dynamic anonymity. In the static anonymity phase, the algorithm draws on statistics and sampling theory, analyzing samples to determine the final generalization interval of each attribute, which raises the information availability of the published dataset. In the dynamic anonymity phase, the algorithm manages the public-set and hide-set of several kinds of structure objects so that, when the source dataset changes after the public-set has been released, the anonymized form can be updated quickly.
The innovations of the thesis include: the determination of sen-associated features in the quasi-identifier; the DIGS algorithm, which raises data accuracy; the personalized model that supports the algorithm's operation; and the top-to-bottom generalization structure of attributes. These contributions not only advance k-anonymity but also provide new ideas for future research in privacy protection.
Simulation experiments indicate that, compared with the traditional k-anonymity algorithm Datafly, DIGS reduces information loss by a notable margin; the ratio approaches 50% when the dataset size and the value of k are set to large values.
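The abstract does not give the details of DIGS, so the following Python sketch only illustrates the general idea of choosing generalization intervals from a sample rather than from a fixed hierarchy. It is not the thesis's algorithm. The function names (`sample_cut_points`, `generalize`, `is_k_anonymous`), the use of sample quantiles as interval boundaries, and the half-open interval convention are all assumptions made for illustration.

```python
import random
from collections import Counter


def sample_cut_points(values, sample_size, n_intervals, seed=0):
    """Estimate interval boundaries from the quantiles of a random sample.

    Assumption: equal-frequency (quantile) cuts stand in for whatever
    statistical rule the thesis actually uses.
    """
    rng = random.Random(seed)
    sample = sorted(rng.sample(values, min(sample_size, len(values))))
    cuts = []
    for i in range(1, n_intervals):
        cut = sample[i * len(sample) // n_intervals]
        # Skip duplicate boundaries so the cut list stays strictly increasing.
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts


def generalize(value, cuts, lo, hi):
    """Map a value to the half-open interval [left, right) containing it.

    `lo` and `hi` are the overall bounds of the attribute domain, with
    `hi` treated as an exclusive upper bound chosen by the caller.
    """
    left = lo
    for cut in cuts:
        if value < cut:
            return (left, cut)
        left = cut
    return (left, hi)


def is_k_anonymous(generalized_values, k):
    """Check that every equivalence class holds at least k records."""
    counts = Counter(generalized_values)
    return all(count >= k for count in counts.values())
```

With a skewed attribute, for example ages concentrated in a narrow band, cuts taken from sample quantiles place narrower intervals where values are dense and wider ones where they are sparse, so records spread more evenly across equivalence classes than with fixed-width intervals. How DIGS actually sizes its sample and selects boundaries is not stated in the abstract.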
Keywords/Search Tags: Privacy Protection, Sampling and Statistics, k-anonymity, Intervals of Generalization, Anonymous Table Maintenance