Research On The Privacy Of Set-valued Data And Its Social Network Data Publication

Posted on: 2018-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: S Lin
GTID: 2348330518456588
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development and popularity of networks, a variety of applications, such as WeChat, Facebook, and online shopping platforms, have produced massive amounts of data. These data contain many potential association rules of considerable social and economic value. For example, the information can be used to analyze people's behavior and to assist business decisions. Because these data usually contain a great deal of individual private information, they must be protected before being published to data miners; otherwise, private information may be disclosed. Data privacy protection has therefore become a hot research area, with many related research efforts in recent years. However, existing research mainly addresses privacy protection for a single data source, whereas in the era of big data, data mining and analysis draw on multi-source data. For example, social network data and transactional data can be mined together to solve the cold-start problem of shopping recommendation systems. As the attacker's background knowledge increases, new privacy problems arise, and existing privacy protection methods can become useless for multi-source associated data.

Compared with relational data, set-valued data is high-dimensional and sparse, so privacy protection methods designed for relational data are not appropriate for it. For example, if the k-anonymity model is applied to set-valued data, the information loss is very large. For set-valued data, the ρ-uncertainty model can balance privacy protection and information loss, and in recent years there have been many research results on the privacy protection of set-valued data based on it. For social networks, there are many privacy protection models, such as k-degree anonymity and l-diversity, which generally add or delete nodes and edges to satisfy the privacy requirements. These protection models can each protect a single data set. But when social network data and set-valued data are released together and the background knowledge increases, the probability of information leakage becomes greater than ρ, so the privacy requirement is not met.

The main work of this paper is as follows.

Firstly, we analyze the existing privacy protection models for set-valued data and social network data and propose an attack model against publishing both together. The existing single-data privacy protection models are no longer applicable under the proposed attack model. Given background knowledge of any data item in the set-valued data, the ρ-uncertainty model ensures that the probability that the attacker can deduce a sensitive data item does not exceed ρ. The model is effective when set-valued data is published alone. However, if the attacker also knows how many friends the victim has in the social application, that is, the victim's degree in the social network data, the probability that the attacker can infer the victim's sensitive item in the set-valued data becomes greater than ρ.

Secondly, to resist this attack model, this paper combines the ρ-uncertainty model and the degree anonymity model and proposes the grouped ρ-uncertainty privacy protection model. The model first builds an artificial generalization tree; for example, apple, banana, and orange are generalized to fruit. Then, according to the generalization tree, the set-valued data is divided into groups: records whose non-sensitive items have the same parent in the generalization tree are placed in the same group. The model requires each group to satisfy ρ-uncertainty, and it is proved that the whole data set satisfies ρ-uncertainty when each group does. Finally, the nodes of the social network are divided into groups consistent with the groups of the set-valued data, and the social network data is anonymized group by group. The grouped ρ-uncertainty model requires that the nodes of the social network in the same group have the same degree. Therefore, even with the background knowledge above, the probability of privacy disclosure is less than ρ, and the privacy requirement is achieved.

Besides, based on this model, this paper also designs a protection algorithm. To improve data utility and reduce information loss, the algorithm combines local generalization with partial suppression. During protection, we use a top-down local generalization method; if generalization does not achieve the privacy requirement, we use partial suppression instead. Generalization reduces information loss, while partial suppression increases it. Therefore, it is necessary to evaluate the information loss of the data before and after each generalization: if the information loss after generalization is smaller, the generalization is applied; otherwise, it is rejected. To improve data utility, our method also tries to preserve the community integrity of the social network: it preferentially deletes edges between communities and adds edges within communities, to reduce the influence on community structure.

Finally, to verify the practicality of the algorithm, the set-valued data is evaluated in terms of information loss, and the social network data is measured with the Jaccard similarity coefficient. The experimental results show that our method ensures privacy protection while retaining good data utility.
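
As a rough illustration of the attack model described in the abstract, the following Python sketch computes the attacker's inference confidence with and without knowledge of the victim's degree. The record layout (`items`, `degree`), the helper `confidence`, the sensitive item `medicine`, the threshold of 0.5, and all toy data are assumptions made for illustration; they are not taken from the thesis, and the degree equalization is simulated by overwriting values rather than by editing edges.

```python
# Toy records: each user's purchased items plus their degree (friend count)
# in the social network. All values are invented for illustration.
RHO = 0.5               # hypothetical privacy threshold
SENSITIVE = "medicine"  # hypothetical sensitive item

# All non-sensitive items here share the parent "fruit" in a generalization
# tree, so the four records would fall into one group.
records = [
    {"items": {"apple", "medicine"}, "degree": 3},
    {"items": {"apple", "banana"}, "degree": 5},
    {"items": {"apple"}, "degree": 5},
    {"items": {"apple", "orange"}, "degree": 5},
]


def confidence(records, known_items, sensitive, degree=None):
    """Share of records matching the attacker's knowledge that hold `sensitive`."""
    matching = [
        r for r in records
        if known_items <= r["items"] and (degree is None or r["degree"] == degree)
    ]
    if not matching:
        return 0.0
    return sum(sensitive in r["items"] for r in matching) / len(matching)


# Item knowledge alone: 1 of 4 matching records holds the sensitive item.
items_only = confidence(records, {"apple"}, SENSITIVE)

# Item knowledge plus the victim's degree: only one record matches.
with_degree = confidence(records, {"apple"}, SENSITIVE, degree=3)

# Give every node in the group the same degree, as the grouped model requires
# (simulated by overwriting the value; a real method would add/delete edges).
equalized = [dict(r, degree=5) for r in records]
after_grouping = confidence(equalized, {"apple"}, SENSITIVE, degree=5)

print(items_only, with_degree, after_grouping)
```

On this toy data the confidence is 0.25 with item knowledge alone, rises to 1.0 once the degree is known, and returns to 0.25 after degrees within the group are equalized, which mirrors the argument that degree knowledge stops helping the attacker when all nodes in a group share one degree.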
Keywords/Search Tags: set-valued data, social network, privacy protection, community structure