Font Size: a A A

Research On Mining And Query Processing Algorithms On Dynamic Graphs

Posted on:2014-04-08Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y J YangFull Text:PDF
GTID:1268330392472593Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the development of information technology, the amount of information in eachapplication field presents explosive growth trend. The relationships among various enti-ties become more and more complex and there are the huge scientific value and commer-cial value behind them. Recently, people study the relationships among entities. Graph,as a general data structure in computer science, is a good way to describe the relationship-s among entities. Graphs have been widely used to model complex relationships amongvarious entities in real applications.However, the relationships among entities often evolve over time. For example, inprotein interaction networks, proteins interact with each other in order, thus the relation-ships of interaction evolve over time. In social networks, the communication relationshipsamong people and communities also change with time. In addition, in road networks, thetime and toll fee through a road are time dependent. Therefore, graphs often evolve overtime.Dynamic graphs are the graphs that evolve over time. They can be divided into t-wo categories:(1) the topologies of graphs change with time; and (2) the data contentrepresented by vertices or edges changes with time or the evaluation model for a specif-ic data object in graph changes with time. We call them topology change and contentchange respectively. Since there are many dynamic graphs in real applications, it is veryimportant to study the problems on dynamic graphs. However, the existing methods forstatic graphs cannot solve the problems on dynamic graphs. The reasons are as follows.First, the static graph model cannot describe the changes in dynamic graphs. Second,the problem definitions on static graphs are not suitable for dynamic graphs. Third, theinherent computational complexity of problems on dynamic graphs is extremely high, theexisting methods on static graphs cannot solve these problems efficiently and effectively.In addition, the existing methods about dynamic data focus on data stream and relationaldata maintenance, these methods cannot handle the complex graph data.This thesis investigates the problems for topology dynamic graphs and content dy-namic graphs respectively, utilizing the knowledge of computational complexity theoryand algorithmic technology. The main contributions of this thesis are as follows. (1) This thesis studies the problem of mining most frequently change componentfrom evolving graphs. An objective function based on “cumulated connectivity change”is proposed to measure the change frequency for a subgraph. A two-way algorithm isproposed to compute the cumulated connectivity change when graph evolves from a snap-shot to the next snapshot. Furthermore, an algorithm based on partition tree is proposedto mine the objective subgraph by removing the vertices not in this subgraph one by one.This thesis analyzes the time and space complexity for algorithms. The experimental re-sults on real-life datasets state the objective subgraph found by algorithm changes mostfrequently in the whole evolving graph and the algorithm is efficient and effective.(2) This thesis studies the problem of mining k-vertex set with the maximum sumof connectivity change among vertices in this set. A new structure named “connectivitytree” is proposed, it can maintain the connectivity information efficiently for every pairof vertices in graphs. An algorithm is proposed to construct connectivity tree. Further-more, an efficient algorithm is proposed to update connectivity tree. This algorithm is anedge incremental algorithm. When graphs are evolving, by this algorithm, there is onlya small fraction of vertices among which the connectivity needs to be updated. A branchand bound algorithm with three efficient pruning rules is proposed to mine k-vertex set.The experimental results on real-life datasets confirm the k-vertex set can reflect and cap-ture the connectivity change in the whole evolving graphs effectively. Therefore, it canserve as a summary of connectivity change of dynamic graphs. The experimental resultsconfirm the efficiency and effectiveness of algorithms.(3) This thesis studies the problem of dynamic optimal path over multi-cost graphs.The problem is proved to be NP-hard in this thesis and it cannot be solved by the exist-ing shortest path algorithms. A branch and bound algorithm with three efficient pruningrules is proposed to compute the optimal path. Furthermore, a novel “k-cluster index” isproposed to make our algorithm more efficient on large graphs. k-cluster index utilizes“contour skyline” to facilitate the optimal path query processing. The problem to com-pute contour skyline is proved to be NP-hard in this thesis. A2-approximate algorithm isproposed to compute contour skyline and we show there is no (2)-algorithm in polyno-mial time. Finally, a vertex-filter algorithm is proposed to enhance the efficiency, whichcan filter a large fraction of vertices not in the optimal path. The experimental results onreal-life datasets confirm the efficiency of algorithm.(4) This thesis studies the problem of the cost-optimal path with time constraint over time-dependent graphs. A two-step algorithm is proposed to find the cost optimalpath with time constraint. In the first step, algorithm computes the minimum cost fromsource to destination. In the second step, algorithm finds the optimal path from sourceto destination. This thesis analyzes the time and space complexity for algorithm. Theexperimental results on real-life datasets confirm the efficiency of algorithm.
Keywords/Search Tags:dynamic graphs, data mining and query processing, connectivity change, frequently change component, dynamic optimal path, cost optimal path
PDF Full Text Request
Related items