
Research On Triangle Computation Of Large-scale Graphs

Posted on: 2021-02-09    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Y X Zhang    Full Text: PDF
GTID: 1488306107955749    Subject: Computer Science and Technology
Abstract/Summary:
In recent years, big data processing has attracted considerable research interest. Large-scale graphs, exemplified by Web graphs and online social networks, are a widely used type of big data; they grow rapidly in scale and pose ever higher challenges to processing systems. In graph processing, a triangle is an important measure of the tightness of the connections between vertices and a basic building block of graph structural analysis. Triangle Computation (TC), which refers to enumerating all triangles in a graph or counting their number, is a fundamental tool in graph processing and is widely used in real-world applications such as spamming activity detection, content quality assessment of online resources, and analysis of protein interaction networks. The key challenges of TC on large-scale graphs stem from the random memory accesses and massive network messages caused by the randomness of the connections between vertices.

Because shared-memory TC avoids network messages and exploits the relatively high bandwidth and random-access support of RAM, it is usually preferable to TC on a small- or medium-scale cluster in terms of cost effectiveness, ease of programming, and processing efficiency. However, the limited RAM capacity of a single machine and the huge sizes of large-scale graphs restrict the applicability of shared-memory TC and highlight the importance of graph compression, whose main challenge is to guarantee a reasonable compression ratio and high speed at the same time. To address these problems, we propose CIC-PIM (Chunked Index Compression for Parallel In-Memory graph processing), a lightweight compression scheme. Non-index data are compressed with a suitable encoding scheme. Index data are compressed by exploiting the ubiquitous sparseness and power-law degree distributions of large-scale graphs: index structures are divided into chunks of appropriate size to obtain good compressibility, and the chunks are compressed with our fixed-length byte-aligned encoding to achieve reasonable space savings, full random-access support, and high cache efficiency. Compared with Ligra+, a typical graph compression scheme for the shared-memory setting, CIC-PIM achieves a compression ratio of 69% on index data, improves the whole-graph compression ratio by 24% (to 52%), and improves processing speed by 8%.
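As an illustration of the chunked-index idea, the following Python sketch splits a CSR-style offset array into fixed-size chunks, stores one absolute anchor per chunk, and encodes the remaining entries as byte-aligned fixed-width deltas so that any entry can still be read in constant time. The chunk size, layout, and function names are assumptions made for illustration only, not the actual CIC-PIM encoding.

```python
# A minimal sketch of a chunked, fixed-width, byte-aligned index encoding.
# Chunk size, layout, and names are illustrative assumptions, not the
# actual CIC-PIM format.
import struct

CHUNK = 64  # number of index entries per chunk (tunable in this sketch)

def compress_index(offsets):
    """Encode a CSR offset array as independent chunks: one 8-byte absolute
    anchor per chunk plus per-chunk fixed-width deltas."""
    chunks = []
    for start in range(0, len(offsets), CHUNK):
        block = offsets[start:start + CHUNK]
        anchor = block[0]
        deltas = [v - anchor for v in block]
        width = next(w for w in (1, 2, 4, 8) if max(deltas) < 256 ** w)
        fmt = {1: "B", 2: "H", 4: "I", 8: "Q"}[width]
        chunks.append(struct.pack("<q", anchor) + bytes([width])
                      + struct.pack("<%d%s" % (len(deltas), fmt), *deltas))
    return chunks

def lookup(chunks, i):
    """Random access to entry i: decode only the chunk that contains it."""
    payload = chunks[i // CHUNK]
    anchor, = struct.unpack_from("<q", payload, 0)
    width = payload[8]
    fmt = {1: "B", 2: "H", 4: "I", 8: "Q"}[width]
    delta, = struct.unpack_from("<" + fmt, payload, 9 + (i % CHUNK) * width)
    return anchor + delta
```

In this sketch, lookup(compress_index(offsets), i) returns offsets[i] while touching only one small chunk, which is what keeps random access over the compressed index cheap and cache-friendly.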
Distributed in-memory TC is a reasonable way to process larger graphs in memory, and its key challenge is to drastically reduce, or even avoid, network messages. We therefore propose LiteTE, which comprises three main techniques. (1) Graphs are divided into huge partitions to fully exploit the large memory of modern machines, which, together with techniques such as message coalescing, solves the problem of massive network messages. (2) To solve the load-imbalance problem without incurring high overhead, LiteTE adopts a three-level scheme combining subgraph-, node-, and thread-level load balancing. (3) To exploit bidirectional network bandwidth and speed up the broadcast of huge data chunks, all nodes pass data in lockstep with one another, which improves data transfer speed significantly. Experimental results show that, compared with Surrogate, HavoqGT, and PTE, LiteTE dramatically reduces network messages, achieves good load balance, and improves computing speed by 49×.

When graphs are larger than the aggregated RAM of the cluster, we turn to distributed out-of-core TC. In addition to the problems caused by massive network messages in distributed in-memory TC, we must significantly reduce intermediate data and fully use the aggregated I/O bandwidth of the cluster. Because existing solutions are inefficient due to massive network messages, massive intermediate data, and low resource utilization, we propose HOSA (Hold One and Shift Another). To exploit I/O bandwidth and reduce intermediate data, HOSA adopts a non-overlapping graph partition and placement strategy: the neighbor lists of a graph are evenly divided into chunks, the edge list is correspondingly divided into blocks, and non-overlapping subgraphs are obtained. The subgraphs are then distributed across the nodes in a balanced manner, so that network messages are avoided and I/O bandwidth is used effectively. To use resources effectively and speed up computing, we propose a data pre-passing strategy in which computation is divided into interlaced passing stages and processing stages: in each passing stage, the data needed by the following processing stage is transferred using the full network and I/O bandwidth of the cluster, so that in the following processing stage all required data can be found locally. Thus network messages are avoided and resources are utilized more effectively. Experimental evaluations show that, compared with the typical TC algorithms HavoqGT and PTE, HOSA achieves speedup factors of 6.4× and 57×; compared with the distributed in-memory scheme LiteTE, HOSA reaches 75% of its processing speed while being able to process graphs 20× larger.
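To make the interleaving of passing and processing stages concrete, here is a toy single-process sketch of the pattern, with a background thread standing in for the cluster's network and disk transfers; the function names and the use of a Python queue are assumptions for illustration, not HOSA's actual implementation.

```python
# A toy, single-process illustration of interlaced "passing" and
# "processing" stages: a background thread stands in for the cluster's
# network/disk transfers.  Names and structure are illustrative only.
import threading
from queue import Queue

def pre_passing_pipeline(block_ids, fetch, process):
    """fetch(block_id) plays the role of a passing stage (remote transfer);
    process(data) plays the role of a processing stage over local data."""
    ready = Queue(maxsize=1)

    def passer():
        for b in block_ids:
            ready.put(fetch(b))   # passing stage: stage data for the next round
        ready.put(None)           # sentinel: nothing left to pass

    threading.Thread(target=passer, daemon=True).start()

    results = []
    while True:
        data = ready.get()        # everything this stage needs is already local
        if data is None:
            break
        results.append(process(data))  # processing overlaps the next fetch
    return results
```

With this structure, fetching the data for block r+1 overlaps with processing block r, so each processing stage finds all of its data locally, which is the effect the data pre-passing strategy aims for.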
Keywords/Search Tags: Large-Scale Graph Processing, Parallel Computing, Distributed Computing, Shared-Memory, Triangle Computation