Font Size: a A A

Research On The Technology Of Graph Clustering And Generation With Differential Privacy

Posted on:2021-04-26Degree:MasterType:Thesis
Country:ChinaCandidate:Z J LinFull Text:PDF
GTID:2428330623982210Subject:Computer technology
Abstract/Summary:PDF Full Text Request
With the rapid development of information technology and the widespread use of portable mobile Internet devices,a large amount of personal user data is widely collected by Internet service providers,such as location data generated by various apps,social network data,personal health status data,etc.The collection of these data on the one hand enables service providers to provide users with better personalized recommendation services,but on the other hand it also brings unprecedented challenges to user privacy.Once these private data are leaked,they may threaten the user's property or even life safety.The frequent large-scale data breaches in recent years have further deepened people's concerns in this regard.Therefore,how to fully mine and utilize data under the premise of ensuring private data security has also become a research hotspot in the field of network security.Differential privacy technology can quantify the level of privacy protection in detail,while retaining the availability of data to a certain extent.It has become the de facto standard in the field of privacy protection and has attracted widespread attention in academia and industry.In view of the privacy leakage problems faced by a large number of graph data generated by various network information systems in practical applications,this paper mainly studies the graph data clustering and synthetic graph generation technology under differential privacy,and achieves the following results:1.Aiming at the problem of privacy leakage in the process of graph data clustering analysis,based on a classic graph clustering algorithm SCAN(Structural Clustering Algorithm for Networks),by defining the edge differential privacy of the correlation graph,and reasonably measure the global sensitivity when calculating the similarity of node structures in graphs,a graph clustering algorithm DP-SCAN(Differentially Private SCAN)under the edge differential privacy is proposed to solve the problem of privacy leakage in the process of graph clustering analysis.Theoretical analysis shows that the proposed DP-SCAN algorithm satisfies-differential privacy and has good usability;simulation experiments show that the proposed DP-SCAN algorithm has a higher level on different data sets under the premise of satisfying privacy protection requirements,that is,the structural characteristics of the original graph data are well preserved.2.Aiming at the problem of privacy leakage caused by directly publishing synthetic graphs derived from real graph data,a synthesis graph generation algorithm based on dK sequence model under node differential privacy is proposed,which makes a compromise between the privacy and usability of synthetic graphs.Specifically,firstly,the global sensitivity of the degree sequence is reduced by the graph projection method,and then the 2K sequence of the compressed graph is clustered into several continuous and disjoint subsequences according to its sensitivity,finally,Laplace noise is added to each subsequence to satisfy the differential privacy,and the noisy 2K sequence is used to generate a synthetic graph.Theoretical analysis shows that the designed synthetic graph generation algorithm satisfies-differential privacy;simulation experiments show that the designed algorithm has high availability on data sets of different sizes under the premise of satisfying privacy protection requirements.
Keywords/Search Tags:Differential Privacy, Privacy Protection, Graph Data, Clustering Algorithm, Synthetic Graph, Data Publishing
PDF Full Text Request
Related items