Research On Sensitive Attribute Protection For Healthcare Data

Posted on: 2020-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: N N Cheng
Full Text: PDF
GTID: 2404330590477077
Subject: Information security
Abstract/Summary:
The spread of networks and the rapid development of information technology have accelerated information sharing and made information easier to access. The sharing of healthcare data plays an important role in this trend. However, healthcare data contains a large amount of private information about users, and releasing it directly leaks that information. To protect users' privacy when healthcare data is released, the release method must match the characteristics of the data, which fall into two cases: data with a single sensitive attribute and data with multiple sensitive attributes.

In healthcare data with a single sensitive attribute, an attacker who analyzes the semantics of the sensitive attribute values can mount background knowledge attacks and similarity attacks, both of which can leak users' medical privacy. Most existing anonymity models constrain only the number of sensitive attribute values in each equivalence class, so they place insufficient semantic constraints on the sensitive attribute. To solve this problem, a (?,k)-anonymity model was designed for protecting single-sensitive-attribute healthcare data. Using the semantic hierarchy tree of the sensitive attribute, the model divides each equivalence class of size k into ? groups and enforces the semantic constraints through this grouping. After grouping, records within a group are semantically similar but not identical, and records in different groups are semantically different. Because grouping keeps semantically similar and semantically different records in moderate proportion within each equivalence class, the model resists background knowledge attacks and similarity attacks at the same time. To improve data utility, a distance measure is used to reduce information loss while the semantic constraints remain satisfied. The equivalence classes are then generalized according to a hierarchical generalization tree, which protects privacy when single-sensitive-attribute healthcare data is released.

The sensitive attributes of multi-sensitive-attribute healthcare data consist of a primary sensitive attribute and several secondary sensitive attributes, which makes them more complex. The primary sensitive attribute is vulnerable to semantic attacks, and information can also leak through the sensitivity of the primary sensitive attribute and through its relationship with the other sensitive attributes. Current protection models for multiple sensitive attributes tend to resist only a single type of attack, which makes them unsuitable for releasing healthcare data. We designed a (?,k,m)-anonymity model for healthcare data with multiple sensitive attributes. By building a two-dimensional semantic-sensitivity bucket for the primary sensitive attribute, the model divides each equivalence class of size k into ? groups and constrains the distribution of both semantics and sensitivity, which protects the equivalence classes against semantic attacks and sensitivity attacks. The frequencies of the m secondary sensitive attributes are then constrained to prevent association attacks, completing the protection of multi-sensitive-attribute healthcare data. Finally, a distance measure and a hierarchical generalization tree are used to reduce the information loss caused by generalization.

Experiments show that, while protecting privacy in healthcare data with a low information loss rate and low running time, the (?,k)-anonymity model effectively resists both background knowledge attacks and similarity attacks when single-sensitive-attribute healthcare data is shared. The (?,k,m)-anonymity model effectively reduces the overall vulnerability of released multi-sensitive-attribute healthcare data, so that the released data resists not only semantic attacks (background knowledge and similarity attacks) but also sensitivity attacks and association attacks.
Keywords/Search Tags: privacy protection, healthcare data, sensitive attributes, anonymity, distance measurement