Research On An Enhanced Identity-reserved Anonymity Approach

Posted on: 2018-10-06
Degree: Master
Type: Thesis
Country: China
Candidate: K Du
GTID: 2348330518957161
Subject: Software engineering
Abstract/Summary:
With the advent of the information age, more and more data are being put to use. For example, data generated by online shopping can feed recommendation systems, and medical data such as patients' medical records can be used to study the relationship between diseases and their complications. However, publishing such data directly compromises individual privacy. Moreover, simply deleting the identity information from the data does not protect personal privacy, because attackers can still infer individuals' sensitive information from correlations within the data. Privacy-preserving data publishing has therefore attracted growing attention from researchers. The earliest protection model for data publishing is k-anonymity. Although k-anonymity and related methods can protect users' privacy, they tend to delete individual attributes and ignore the relationships among them. Tong et al. proposed privacy preservation with identity reservation, but their anonymity models do not take the relationships between an individual's records into account and can still leak private information. In this paper, we further study privacy preservation with identity reservation. Our research contents are as follows.

First, we analyze the problems in the existing privacy models with identity reservation. Although identity-reserved (k,l)-anonymity and identity-reserved (α,β)-anonymity prevent record linkage attacks, when restricting sensitive values they do not consider the relationships among different records that belong to the same individual.

Second, to prevent this privacy leakage, we introduce the concepts of reasoning space and reasoning set, and propose enhanced identity-reserved l-diversity and enhanced identity-reserved (α,β)-anonymity. Deciding whether an equivalence class satisfies enhanced identity-reserved l-diversity is converted into the minimum hitting set problem, and deciding whether it satisfies enhanced identity-reserved (α,β)-anonymity is converted into finding the highest frequency of the sensitive values. We also give more reasonable definitions of the information loss caused by generalization for numeric and categorical attributes, and use information loss to define the distance between individuals and the distance between classes.

Third, we present a general clustering-based anonymization algorithm that makes a dataset satisfy a given identity-reserved privacy model. The anonymity models differ in how an equivalence class is judged to satisfy the given privacy requirement. The algorithm first recodes the identity attributes of the original data and then determines whether the data meet the given privacy requirement. If they do, clustering is used to generate an equivalence class that satisfies the privacy requirement; otherwise, the residual records are handled. We also analyze the complexity of the algorithm.

Finally, experimental results demonstrate the vulnerability of identity-reserved (k,l)-anonymity and identity-reserved (α,β)-anonymity. Our enhanced approaches provide stronger privacy preservation, with information loss and runtime very close to those of the identity-reserved anonymity methods.
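The two subproblems named in the abstract, minimum hitting set and the highest frequency of sensitive values, can be illustrated generically. The sketch below is only a hedged illustration: the abstract does not define reasoning space or reasoning set, so the data model here (an equivalence class as a list of hypothetical `(individual_id, sensitive_value)` pairs), the function names, and the brute-force search are all assumptions, not the thesis's definitions or algorithm.

```python
from collections import Counter
from itertools import combinations


def sensitive_sets_by_individual(records):
    """Group an equivalence class by individual.

    `records` is a list of (individual_id, sensitive_value) pairs; this
    layout is an assumption made for illustration only.
    """
    grouped = {}
    for individual, value in records:
        grouped.setdefault(individual, set()).add(value)
    return grouped


def minimum_hitting_set(sets):
    """Return a smallest set of values that intersects every set in `sets`.

    Brute force over candidate sizes; exponential in the worst case, so it
    is only suitable for small equivalence classes.
    """
    sets = [set(s) for s in sets]
    if any(not s for s in sets):
        raise ValueError("an empty set can never be hit")
    universe = sorted(set().union(*sets))
    for size in range(len(universe) + 1):
        for candidate in combinations(universe, size):
            chosen = set(candidate)
            if all(chosen & s for s in sets):
                return chosen
    return set(universe)  # not reached: the full universe always hits


def highest_sensitive_frequency(records):
    """Return (value, fraction of records) for the most common sensitive value."""
    counts = Counter(value for _, value in records)
    value, count = counts.most_common(1)[0]
    return value, count / len(records)
```

For a hypothetical class `[("p1", "flu"), ("p1", "HIV"), ("p2", "flu"), ("p3", "cancer"), ("p3", "flu")]`, every individual's set of sensitive values contains "flu", so the minimum hitting set has size 1 and "flu" accounts for 3 of the 5 records. How these quantities are compared against l, α and β is defined in the thesis itself and is not reproduced here.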
Keywords/Search Tags:data publication, privacy preservation, identity preserving, anonymity, generalization, suppression