Font Size: a A A

Research On Link Prediction Based On Subgraphs Association Rules

Posted on:2013-11-25Degree:MasterType:Thesis
Country:ChinaCandidate:W Q YuanFull Text:PDF
GTID:2248330362474787Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
The aims of traditional data mining are to find the association between the singleentity attributes. But actually relationships between entities also have very richsemantics. Graph-based method is very suitable for representing multi-relationaldata. Each vertex in the graphs represents an entity, edge between them represents therelationship, and the edge can be directed or undirected. So graphs are goodrepresentations of entities and the relationships between entities. The flexibilitydefinition makes graphs can model a variety of process. Graphs can represent theprocess that relationships between entities change over time. This representation can beeither static or temporal. The temporal network is composed by a sequence of staticgraphs; each graph is a snapshot of the relationship in a specific moment or between avery small period of time.Link mining is a newly emerging research area that is at the intersection of the workin graph mining, link analysis, web mining, relational learning and inductive logicprogramming. Most of the current graph-based link predictors make use of the graphtopology prediction. Most of time, useful links are relatively dispersed. Existingmethods need to scan all vertexes and edges, which will lead to do some useless work.And when graphs are very big, these models need to pay a high computational cost.In this paper, a link prediction method based on subgraphs association rules isproposed. On the one hand, traditional association rules are based on a relationaldatabase, so they are not suitable for the network which represents the structureddata. On the other hand, mining frequent subgraph from networks has been relativelymaturely studied. Frequent subgraphs can filter out the meaningful edges of anetwork. In this paper, frequent subgraphs are applied to the association rules, namelyfrequent subgraphs are deemed to be the items of association rules. Firstly, this methodcan help to improve the tractability of predicting, because frequent subgraphs canremove the sparse edges of graphs, and the number of frequent subgraphs isasymptotically smaller than the number of edges in a graph. Secondly, subgraphsassociation rules can combine subgraphs with the different particle size, rather thanusing a single subgraph, which helps to ensure the accuracy and coverage of thepredicting.In this paper, two types of link prediction algorithms based on frequent subgraphs are proposed for streaming network data, such as temporal network etc. Differentwith traditional methods that can predict links which have not occurred before, bothmethods presented in this paper is to predict the links that have occurred at some pointin the past. The first one called subgraphs association rules, they are similar to thetraditional association rules and can be both used in the timing network and used inother streaming network data. Subgraphs association rules are time-independent, andcan’t predict the occurrence time of one link. The second is temporal subgraphassociation rules. They are used in temporal network, can predict when the observedlink will happen again in the future. The experiment of the artificial dataset shows thatsubgraphs association rules method can improve the accuracy of link prediction; and theexperiments of Enron email dataset and IMDB photo network show that temporalsubgraphs association rules have a high accuracy of link prediction.
Keywords/Search Tags:Frequent Subgraphs Mining, Streaming Network, Link Prediction, Subgraphs Association Rules, Temporal Subgraphs Association Rules
PDF Full Text Request
Related items