
Research On Privacy Preserving Methods Based On Vector Model In Weighted Social Networks Publication

Posted on: 2016-07-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L H Lan
GTID: 1108330470960906
Subject: Computer application technology
Abstract/Summary:
Social networks are relatively stable systems of relations formed by interactions between individuals. They model many social phenomena and are among the most representative real-world complex networks. As social networks grow and more individuals join them, a large amount of information is collected. To meet the needs of scientific research and data sharing, data collectors need to release social network datasets, but the published datasets contain sensitive information about individuals, so publication puts their privacy at risk. As public awareness of privacy continues to grow, privacy concerns have become the main obstacle to data release. To protect individuals' privacy, privacy preservation must be applied when social networks are published.
At present, most existing research on social network privacy preservation focuses on un-weighted social networks. In un-weighted social networks, the connections between individuals are Boolean relations: they indicate only whether two individuals interact, not how strongly. A growing body of empirical research shows that connections between individuals are not purely Boolean but have different coupling strengths. Examples such as the closeness of relationships between people, bandwidth on the Internet, the number of flights or seats between airports in aviation networks, and the number of collaborations between scientists in scientific cooperation networks are important factors influencing network characteristics. It is therefore necessary to introduce a quantity that measures the degree of coupling between nodes in the topology of a social network, that is, to assign a weight to the edge between two nodes to represent the strength of their relationship.
Because of the edge weights, a weighted social network carries much more information than an un-weighted one, so research on privacy protection for weighted social network publication is both necessary and meaningful. This dissertation proposes local-perturbation privacy preservation methods based on a vector model for publishing weighted social networks. The details are as follows:
(1) The publishing scenarios of weighted social networks were defined according to two performance indicators: privacy protection quality and released data utility. The release scenario must be determined first when protecting the privacy of social networks. Explicit knowledge of the attackers' background knowledge, the purpose of the released dataset, and the private information is necessary to choose an effective protection strategy and design privacy protection methods. The two important performance indicators in social network publication are privacy protection quality and released data utility. Depending on the characteristics of the released datasets and practical demands, data publishers face three choices. First, the release aims to improve data utility as much as possible while maintaining acceptable privacy protection quality. Second, the release aims to improve privacy protection quality as much as possible while maintaining acceptable data utility. Third, the release aims to balance privacy protection quality and data utility. These three scenarios are defined in this dissertation.
In each scenario, the nodes of the weighted social network, including the edge weights between nodes, are the private information; the purpose of the release is to analyze structural characteristics, focusing on the average clustering coefficient, average path length, and weight distribution; and the attackers' background knowledge takes three forms: node degrees, sub-graphs, and edge weights.
(2) Vector models were proposed as the publication models of weighted social networks. Based on edge space theory in graph theory, the authors used vectors to describe weighted social networks. To reduce the vector dimensions, two methods based on node segmentation were used to construct vector set models of weighted social networks. The segmentation methods express a weighted social network as many sub-graphs, describe each sub-graph by a vector, and construct the vector set model of the weighted social network. The sub-graphs are sparse compared with a dense graph on the same nodes. Perturbing the vectors of these sub-graphs realizes a local perturbation strategy for the weighted social network and thereby protects its privacy.
(3) To improve release utility, a random perturbation privacy preservation method based on vector similarity was proposed for weighted social networks. The method adopts weighted Euclidean distance as the vector similarity metric, constructs candidate sets of released vectors for the sub-graphs according to a threshold specified by the publisher, and randomly selects vectors from the candidate sets of the sub-graphs, combining them into the released vector set of the weighted social network. The final published weighted social network is built from the released vector set.
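As a minimal sketch of items (2) and (3), the Python below describes a sub-graph as an edge-weight vector and picks a released vector at random from the candidates that lie within a publisher-chosen threshold. The function names, the per-coordinate coefficients of the weighted Euclidean distance, and the pre-generated pool of perturbed vectors are illustrative assumptions; the abstract does not say how candidate vectors are generated.

```python
import math
import random


def subgraph_vector(edge_weights, edge_order):
    """Describe a sub-graph as a vector with one coordinate per possible edge.

    Edges that are absent from the sub-graph get weight 0.0.
    """
    return [edge_weights.get(edge, 0.0) for edge in edge_order]


def weighted_euclidean(u, v, coeffs):
    """Weighted Euclidean distance; `coeffs` are assumed per-coordinate weights."""
    return math.sqrt(sum(c * (a - b) ** 2 for a, b, c in zip(u, v, coeffs)))


def candidate_set(original, pool, coeffs, threshold):
    """Keep the vectors from a (hypothetical) pool that are close to the original."""
    return [v for v in pool if weighted_euclidean(original, v, coeffs) <= threshold]


def release_vector(original, pool, coeffs, threshold, rng):
    """Pick one candidate uniformly at random as the released vector."""
    candidates = candidate_set(original, pool, coeffs, threshold)
    if not candidates:
        raise ValueError("no candidate within the threshold")
    return rng.choice(candidates)


# Toy sub-graph on nodes a, b, c with a fixed edge ordering.
edge_order = [("a", "b"), ("a", "c"), ("b", "c")]
original = subgraph_vector({("a", "b"): 3.0, ("b", "c"): 1.0}, edge_order)
pool = [[3.0, 0.0, 2.0], [2.0, 1.0, 1.0], [0.0, 5.0, 4.0]]
coeffs = [1.0, 1.0, 1.0]
released = release_vector(original, pool, coeffs, threshold=1.5, rng=random.Random(0))
```

A smaller threshold keeps the released vector closer to the original (higher utility) but shrinks the set an attacker has to search, which is the trade-off this release scenario accepts.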
The proposed method forces attackers to re-identify targets within a large result set in which all vectors are equally likely, which increases the uncertainty of recognition. The method increases the similarity of the vectors in the candidate sets of the sub-graphs, maximizes the similarity between the original social network and the released network, and improves release utility.
(4) To improve privacy preservation quality, a vector mapping method based on the differential privacy model was proposed for weighted social network publication. The method exploits the strong privacy guarantee of differential privacy and designs the WSQuery query model so that queries on weighted social networks satisfy differential privacy. The WSQuery model captures the structure of a weighted social network and returns a sequence of triples as the query result set. The WSPA algorithm, designed according to the WSQuery model, maps the query result set into a real-valued vector and injects Laplace noise into the vector to achieve privacy protection. Because the WSPA algorithm has high error, the LWSPA algorithm was proposed: it partitions the triple sequence of the query results into multiple subsequences and constructs, for each subsequence, an algorithm that satisfies differential privacy, which reduces the error. The method meets the demand for released data utility while improving privacy protection quality.
(5) To balance privacy preservation quality and release utility, a vector mapping method based on random projection was proposed for weighted social network publication. The method describes a weighted social network as high-dimensional vectors and maps the original high-dimensional vector set into a low-dimensional target vector set using the low-distortion mapping of random projection.
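The two perturbations named in items (4) and (5) can be illustrated on a plain weight vector. The sketch below is not the WSQuery, WSPA, or LWSPA procedure: it only shows the generic Laplace mechanism and a standard Gaussian random projection. The `sensitivity` and `epsilon` parameters, the function names, and the choice of Gaussian matrix entries are assumptions made for illustration.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_perturb(vector, sensitivity, epsilon, rng):
    """Generic Laplace mechanism: add noise of scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in vector]


def random_projection_matrix(k, d, rng):
    """k x d matrix with Gaussian entries scaled by 1/sqrt(k).

    This is a standard construction, assumed here for illustration only.
    """
    return [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)] for _ in range(k)]


def project(matrix, vector):
    """Map a d-dimensional vector to k dimensions (matrix-vector product)."""
    return [sum(r * x for r, x in zip(row, vector)) for row in matrix]


rng = random.Random(42)
weights = [3.0, 0.0, 1.0, 2.0, 0.0, 4.0]           # toy edge-weight vector
noisy = laplace_perturb(weights, sensitivity=1.0, epsilon=0.5, rng=rng)
matrix = random_projection_matrix(k=3, d=len(weights), rng=rng)
reduced = project(matrix, weights)                  # 6 dimensions -> 3
```

In the Laplace mechanism a smaller `epsilon` means more noise and stronger protection; in the projection a smaller `k` means more distortion. Both knobs trade released data utility against privacy protection quality.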
The dimensionality reduction removes redundant information and achieves privacy preservation through the numerical distortion introduced by the dimension-reduction transformation. Building on the basic vector set random projection method, an improved vector set random projection method was proposed to prevent disclosure of the random projection matrix, which would allow the original dataset to be reconstructed. The method constructs the elements of the random matrix by combining two random functions, and it is proved that the random mapping based on this matrix satisfies the conditions of the Johnson-Lindenstrauss lemma. The method enhances privacy preservation quality while obtaining higher released data utility, thus balancing the two.
(6) Simulation experiments on six real datasets were conducted for the three privacy preservation methods based on vector mapping in this dissertation. The methods were compared with existing algorithms, the performance of each method was analyzed, and the effectiveness of the proposed methods was verified. The execution time of the algorithms implementing the three methods was analyzed. Experiments comparing against six algorithms under specific privacy attacks were conducted. Privacy preservation quality was measured against three kinds of background knowledge used to identify nodes: degrees, sub-graphs, and weights. Released data utility was measured by three structural parameters: the average shortest path, the average clustering coefficient, and the weight distribution. According to the experimental results and analysis, the three proposed privacy protection methods meet the demands of their respective release scenarios and balance privacy preservation quality and released data utility.
Keywords/Search Tags: weighted social networks, privacy preservation, vector models, edge space, differential privacy, random projection