
An Anonymous Privacy Protection Method For Multiple Sensitive Attribute Data Publishing

Posted on: 2021-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: H Z Mei
Full Text: PDF
GTID: 2518306047998829
Subject: Computer Science and Technology
Abstract/Summary:
With the development of Internet and data mining technology, data is constantly being generated, published, and used. While people enjoy the convenience of data sharing, they also face the risk of privacy disclosure, and how to ensure both the availability and the security of published data has been widely studied in academia. This thesis focuses on anonymous privacy protection in multi-sensitive-attribute data publishing. An analysis of the research status reveals two shortcomings of existing models. First, they ignore the correlation between attributes and generalize all non-sensitive attributes, which leads to a high data hiding rate, heavy information loss, and poor data availability. Second, they do not provide effective hierarchical protection of sensitive attribute values. To address these problems, this thesis proposes a sensitivity classification (?,l)-diversity model based on attribute correlation, together with its implementation algorithm. The specific work is as follows.

First, the thesis introduces anonymization technology in privacy protection, including three classical anonymization models, and then the two anonymous models for publishing multi-sensitive-attribute data that are later used as comparison baselines. The problems faced by anonymous publishing of multi-sensitive-attribute data are described in detail, and the sensitivity classification (?,l)-diversity model based on attribute correlation is proposed. By preserving the correlation between attributes, the model reduces the generalization rate of attributes and improves the availability of the published data. By grading the values of sensitive attributes, it strengthens the privacy protection the model provides.

Second, the thesis presents two sub-algorithms that implement the (?,l)-diversity model. The attribute partitioning algorithm first determines the correlation between quasi-identifier attributes and sensitive attributes, then preprocesses the attributes, and finally applies a new method for defining sensitivity levels. This method defines the sensitivity levels according to the diversity parameter l, which makes the distribution of the defined levels more uniform. The anonymous equivalence group generation algorithm based on frequent itemsets constructs a prefix tree following the idea of the FP-growth algorithm and then traverses the tree to generate equivalence groups, so that the correlation between attributes is preserved as far as possible and the groups are formed more reasonably. The anonymous equivalence groups are then produced by generalization.

Finally, comparison experiments are designed on real data sets with different parameter settings. The experimental results show that, after data is processed by the sensitivity classification (?,l)-diversity model based on attribute correlation, information loss is effectively reduced, the correlation between attributes is preserved, and data availability is improved. At the same time, hierarchical protection of sensitive attributes strengthens the privacy protection of the model, and the frequent-itemset-based algorithm for generating equivalence groups executes more efficiently.
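
The two building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: it assumes the simplest "distinct values" reading of l-diversity applied to each sensitive attribute separately, and it only mimics the FP-growth idea of inserting records into a prefix tree with items ordered by descending frequency. The names `is_l_diverse`, `Node`, `build_prefix_tree`, and the sample records are all hypothetical.

```python
from collections import Counter


def is_l_diverse(group, sensitive_attrs, l):
    """Assumed check: every sensitive attribute takes at least l distinct
    values within the equivalence group (distinct l-diversity)."""
    return all(len({rec[a] for rec in group}) >= l for a in sensitive_attrs)


class Node:
    """One node of the prefix tree; item is an (attribute, value) pair."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_prefix_tree(records, attrs):
    """Insert each record as a path of (attribute, value) items, ordered by
    descending global frequency so that common items share prefixes."""
    freq = Counter((a, rec[a]) for rec in records for a in attrs)
    root = Node()
    for rec in records:
        items = sorted(((a, rec[a]) for a in attrs),
                       key=lambda it: (-freq[it], it))
        node = root
        for it in items:
            node = node.children.setdefault(it, Node(it))
            node.count += 1
    return root


# Hypothetical sample data: two quasi-identifiers, two sensitive attributes.
records = [
    {"age": "30-39", "zip": "100*", "disease": "flu", "salary": "low"},
    {"age": "30-39", "zip": "100*", "disease": "cancer", "salary": "high"},
    {"age": "30-39", "zip": "200*", "disease": "flu", "salary": "mid"},
]

tree = build_prefix_tree(records, ["age", "zip"])
print(is_l_diverse(records, ["disease", "salary"], 2))
```

Records that share a long path in the tree agree on their most frequent quasi-identifier values, so grouping along paths is one plausible way to form equivalence groups that need less generalization. The diversity check would then be applied to each candidate group before publication.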
Keywords/Search Tags: anonymous privacy protection, multi-sensitive attribute, attribute correlation, sensitivity level