
Research On Privacy Protection Of Weighted Social Network Data Publishing Based On Differential Privacy And Closeness Centrality

Posted on: 2022-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Gao
Full Text: PDF
GTID: 2480306551470954
Subject: Master of Engineering
Abstract/Summary:
Big data and other data-driven technologies support the development of social network data analysis, but they also expose the data released by social networks to a serious risk of privacy leakage. Major data breaches have emerged one after another in recent years. For example, the “315 Gala” in 2021 revealed that facial information was being collected without users' consent, that major recruitment platforms were selling resumes at will, and that personal information was being traded on the dark web. Social networks, especially weighted social networks, contain a large amount of private personal or corporate information. If sensitive data is released without privacy protection, attackers can infer sensitive information about users from the released data. Among existing privacy protection technologies, the differential privacy model stands out and has been applied to data protection in industry and other fields. It offers advantages that other privacy protection technologies lack: it resists attacks based on arbitrary background knowledge, and it provides a quantifiable level of privacy protection. This thesis therefore focuses on how to use the differential privacy model to strengthen the privacy of weighted social network graph data before it is published, while keeping the published data usable.

To protect the privacy of important nodes and their important edges in weighted social networks, this thesis proposes EGMA (Edge adding projection algorithm based on node degree sorting and closeness centrality sorting). The EGMA algorithm has three stages: constructing an ordered node set, constructing an ordered edge set, and constructing the generated graph. First, it constructs an ordered node set according to the degree and closeness centrality of the nodes. Then it constructs an ordered edge set according to the ordered node set and the edge weights, and adds the edges to the generated graph structure in the order in which they appear in the ordered edge set. Experiments show that the EGMA algorithm, which is based on the idea of graph projection, outperforms other algorithms in terms of L1 error and edge retention. It retains as many as possible of the nodes with large degree and closeness centrality in the original social network graph, together with the large-weight edges connected to those nodes. In other words, the EGMA algorithm preserves the important structural information of the weighted social network graph and improves the availability of important data in the generated graph.

To protect the private information of the nodes in the graph generated by the EGMA algorithm, this thesis further proposes DHPAKM (Degree histogram publishing algorithm based on k-means algorithm), which satisfies node differential privacy. The DHPAKM algorithm works as follows. First, the EGMA algorithm modifies the original graph structure to produce the generated graph, and the degree histogram of the nodes in that graph is computed. Next, the DHPAKM algorithm improves the k-means clustering algorithm: it randomly selects one initial center point, finds each new initial center point iteratively while protecting its private information, and protects the k initial center points with differential privacy. It then clusters the buckets of the histogram into groups. After the bucket grouping is complete, the DHPAKM algorithm adds noise to the different groups. Finally, it publishes a degree histogram that satisfies ε-differential privacy. Experimental analysis shows that, compared with other algorithms, the DHPAKM algorithm achieves a smaller L1 error and a smaller KS distance on the data sets. It protects privacy in weighted social networks more effectively, retains the node degree information of the original graph more completely, and yields published data with higher availability.

For the privacy protection of community structure in complex weighted social networks, this thesis proposes CSDPA-LPA (Community structure differential privacy algorithm based on improved label propagation algorithm). The CSDPA-LPA algorithm uses node strength and node closeness centrality to improve the label propagation algorithm, and then performs community detection on the social network to divide it into communities. To further protect the structural privacy of the communities, the algorithm generates a perturbed community structure by adding noise to the edge frequencies. To protect the privacy of the edge weights in the community structure, it converts the protection of the edge weights into differential privacy protection of the edge weight sequence. It then merges the communities to generate and publish a noisy social network that satisfies edge differential privacy. The experimental results show that the CSDPA-LPA algorithm outperforms other algorithms on the WARE and ASPL indicators. It protects both the network structure and the edge weights while preserving the utility of the social network data.

Based on the above algorithms, this thesis builds Data Share, a privacy-preserving data publishing system based on differential privacy, and uses it to test the privacy protection effect of the proposed algorithms on graph data. The test results of the Data Share system show that the privacy protection algorithms proposed in this thesis provide better privacy protection and that the published data is more effective.
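
The grouped noise-adding step that the abstract attributes to DHPAKM can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names are hypothetical, the bucket groups are assumed to be given (the differentially private k-means step that produces them is not shown), and the sensitivity value is assumed to be supplied by the caller. The sketch adds Laplace noise to the sum of each group and spreads the noisy sum evenly over the group's buckets.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) as a difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def publish_grouped_histogram(counts, groups, epsilon, sensitivity, rng=None):
    """Return a noisy histogram in which each group of buckets shares one noisy mean.

    counts      -- degree histogram, counts[d] = number of nodes with degree d
    groups      -- disjoint lists of bucket indices (assumed given, hypothetical input)
    epsilon     -- privacy budget spent on this step
    sensitivity -- L1 sensitivity of the histogram (assumed supplied by the caller)
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    noisy = [0.0] * len(counts)
    for group in groups:
        # Perturb the group total once, then share it across the group's buckets.
        group_sum = sum(counts[i] for i in group)
        noisy_sum = group_sum + laplace_noise(sensitivity / epsilon, rng)
        shared_value = noisy_sum / len(group)
        for i in group:
            noisy[i] = shared_value
    return noisy


if __name__ == "__main__":
    histogram = [4, 6, 1, 1]
    bucket_groups = [[0, 1], [2, 3]]
    print(publish_grouped_histogram(histogram, bucket_groups, epsilon=1.0,
                                    sensitivity=1.0, rng=random.Random(0)))
```

Averaging within a group trades a small approximation error for less noise per bucket, which is the usual motivation for grouping similar buckets before perturbing them.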
Keywords/Search Tags:Differential privacy, Data release, Privacy protection, Closeness centrality, Social network