Research On Privacy Protection For Publishing Relational And Transaction Data

Posted on: 2019-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: S M Zhou
Full Text: PDF
GTID: 2428330566975957
Subject: Software engineering
Abstract/Summary:
With the advent of the information age, a large amount of information about our lives is rapidly transmitted as data through the Internet and other media. These data cover many aspects of people's lives, such as shopping, medical, and business records, and institutions and governments can use them for business applications, public welfare regulation, and scientific research. Studying how data can be shared and used is therefore necessary. However, such data usually contain a great deal of personal private information, and releasing them without processing can have serious consequences. Because an attacker can locate an individual through the correlations and characteristics of the data, merely deleting the individual's identifying information does not prevent privacy disclosure. Privacy protection in data publishing has therefore become an active research area, with the goal of protecting privacy while preserving the utility of the data.

Many privacy protection methods exist for relational data or for transaction data. In real life, however, many datasets contain both relational and transaction attributes. For example, a patient may have multiple diseases, and a customer may purchase multiple commodities: the relational attributes describe the basic information of the individual, and the transaction attribute describes the diseases the individual suffers from or the goods purchased. If the two kinds of methods are applied separately to the relational and transaction attributes, the released data still cannot prevent privacy leakage, because the attacker can infer individual privacy by combining relational and transaction attributes. Designing a privacy protection method for relational and transaction data that protects user privacy while ensuring data utility is therefore an urgent problem.

The main research work of this thesis is as follows:

First, the deficiencies of existing privacy protection methods for relational and transaction data are analyzed. Although (k, k^m)-anonymity protects part of a user's sensitive information, it provides no protection when the number of items known to the attacker exceeds m. Moreover, it does not distinguish sensitive from non-sensitive items in the transaction attribute: all items are treated as sensitive, which increases the information loss of the anonymized transaction attribute.

Second, to address these privacy issues, a (k, ρ)-anonymity privacy model is proposed, which requires each equivalence class to contain at least k records and to satisfy ρ-uncertainty. It guarantees that an attacker who knows the values of an individual's relational attributes and some of the transaction items cannot infer the user's sensitive information, and so effectively prevents privacy leakage.

Third, a privacy protection method based on (k, ρ)-anonymity is proposed. To reduce the information loss of the anonymized dataset as much as possible while satisfying (k, ρ)-anonymity, the method first builds an initial clustering structure that evenly partitions the records containing sensitive items, then merges equivalence classes using three relatively optimized strategies, and finally makes each equivalence class satisfy ρ-uncertainty.

Finally, the proposed method is compared with the privacy protection method based on (k, k^m)-anonymity in terms of the utility of the anonymized data and the running time of the algorithm. Experimental results show that the proposed method incurs significantly less information loss than the original method and provides stronger privacy protection. Although its running time is longer, the method targets a static dataset and the anonymization is performed offline.
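The two conditions of the proposed model (a minimum equivalence-class size and an uncertainty bound on sensitive items) can be illustrated with a small checker. This is only a sketch under stated assumptions, not the thesis's algorithm: the abstract does not give the formal definition, so the code assumes the commonly used reading in which, inside each equivalence class, no set of known items may imply a sensitive item with confidence above a threshold. The names `satisfies_k_rho`, `rho`, and `max_antecedent`, and the sample records, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def satisfies_k_rho(records, k, rho, sensitive, max_antecedent=2):
    """Check a simplified (k, rho)-style condition on an already generalized table.

    records: list of (relational_values, items) pairs, where relational_values
    is a hashable tuple and items is a set of transaction items.
    """
    # Records with identical (generalized) relational values form one class.
    classes = defaultdict(list)
    for relational, items in records:
        classes[relational].append(frozenset(items))

    for transactions in classes.values():
        if len(transactions) < k:  # size requirement
            return False
        if not _rho_uncertain(transactions, rho, sensitive, max_antecedent):
            return False
    return True


def _rho_uncertain(transactions, rho, sensitive, max_antecedent):
    # Assumption: a class is violated when some itemset q of bounded size
    # implies a sensitive item s with confidence sup(q + s) / sup(q) > rho.
    items = sorted(set().union(*transactions))
    for size in range(max_antecedent + 1):
        for antecedent in combinations(items, size):
            q = frozenset(antecedent)
            matching = [t for t in transactions if q <= t]
            if not matching:
                continue
            for s in sensitive:
                if s in q:
                    continue
                hits = sum(1 for t in matching if s in t)
                if hits / len(matching) > rho:
                    return False
    return True


# Hypothetical example: one equivalence class of four patients.
records = [
    (("30-39", "F"), {"bread", "flu"}),
    (("30-39", "F"), {"bread", "milk"}),
    (("30-39", "F"), {"milk"}),
    (("30-39", "F"), {"bread", "hiv"}),
]
sensitive = {"flu", "hiv"}
print(satisfies_k_rho(records, k=2, rho=0.5, sensitive=sensitive))
```

Bounding the antecedent size keeps the enumeration tractable for illustration; a checker without that bound would have to consider every subset of items in each class.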
Keywords/Search Tags: relational and transaction data, privacy protection, k-anonymity, ρ-uncertainty, data anonymity