Font Size: a A A

Research On The Compression Algorithm Of Social Network Graph

Posted on:2015-07-23Degree:MasterType:Thesis
Country:ChinaCandidate:Y J NieFull Text:PDF
GTID:2180330464968846Subject:Computer technology
Abstract/Summary:PDF Full Text Request
Recently, with the fast development of World Wide Web(WWW), social networks is becoming popular. Social networks is a complex system which evolves through interactions among a growing set of actors or users, the interaction could be friendships or intermarriages between families. Looked the people as a vertices and the links between them as edges, the social networks can be expressed as a big directed graph, we call it social network graph. The structure of the social network graph is very big, so there is a huge number of vertices and edges in it, and it is very difficult to study. Thus, the research on the compression algorithms of social network graph becomes the first thing to study the social network graph. But the social network graph has many different properties with the web graph, when sorting the nodes with its URL lexicographical, the similarity of the adjacency list is very small, and it makes difficult for the compression of social network graph.By analyzing the properties of social network graph, the paper find the similarity in it. The similarity in social network graph is between the node and the nodes in its adjacency list, Which is because the node has the relationship of friendship or kinship with the nodes in its adjacency list, and the more intimate the relationship between them, the greater the similarity at their socially relationship, so they has great similarity.This paper proposes a new compress scheme, we call it SNComp(Social Networks Compression). SNComp combines the BFS(Breadth First Search) algorithm and the Rabin fingerprinting algorithm. BFS algorithm can implements the hierarchical scan, and store a node at the neighbored place with its children nodes. But there is a problem if only use this method, because the order of all the children is only according to the order of scanning, and didn’t consider the order between the nodes. So SNComp exploits the Rabin fingerprinting algorithm to compute the similarity value of the children nodes with their parent, and then sort all the children by this values, which can ensure that the children sorted by their similarity with their parent node. So the SNComp can ensure the nodes stored at the neighbored place have the most similarity between them. According to this, SNComp uses reference compression, block compression and some other compress methods to compress the social network graph, which can realize the result that the same parts between the nodes only need to store one time and then can reduce the store space.At last, this paper tests on six different data on social networks. The main content to test including the bits per link needs and the time to uncompress and random access. By comparing and analyzing the test results, we can see the correctness of the compression scheme this paper proposes, and there is a development in compression ratio compared with the MPk algorithm.
Keywords/Search Tags:Social network graph, Similarity, Compression, code, Decompression
PDF Full Text Request
Related items