Dynamic Data Partition In Distributed Information Networking Database Management System

Posted on: 2018-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ma
Full Text: PDF
GTID: 2428330515489691
Subject: Computer software and theory
Abstract/Summary:
Data partitioning is an important problem in distributed environments. Whether data is partitioned reasonably affects load balance, and a poor partition introduces unnecessary communication overhead between nodes, especially for data that are related to each other. In the information networking model, an entity in the real world is abstracted as an object, and a relation between entities is abstracted as a relation between objects. Each object includes its attributes and its relations to other objects. When a query starts from one object but needs information about another object, it only has to jump to that object along a specified path rather than perform a messy join. In a distributed environment, however, data is partitioned across different nodes, and frequent jumps between objects incur heavy communication overhead when related data resides on different nodes. In a distributed information networking database system, the main idea for reducing communication overhead between nodes is to place closely related data on the same processing node as far as possible. To achieve this goal, this paper proposes an organization-based dynamic data partitioning algorithm that draws on the characteristics of the information networking model. First, it introduces the concept of relevance degree, which measures how tightly data objects are related, and adjusts it dynamically using query statistics, which helps uncover potential associations between data objects. Second, it abstracts the concept of an organization, based on the characteristics of the information networking model, to represent a set of data objects that are highly correlated with each other, and defines rules for organization detection. Third, it plans the movement of related data to the same node as far as possible with minimum communication cost, and develops a cut strategy for fat organizations. Finally, it guarantees load balance by limiting the maximum available space of each processing node. The dynamic data partitioning algorithm moves the more highly correlated data objects to the same processing node, which allows a query to complete on one node, reduces unnecessary communication overhead between nodes, improves query speed, and optimizes query performance in distributed environments.
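The four steps named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration and not the thesis's actual method: the function names (update_relevance, detect_organizations, cut_fat, plan_placement), the unit increment per query hop, the fixed relevance threshold, the use of connected components as "organizations", the naive chunking used as a stand-in for the cut strategy, and the use of object count as node space.

```python
from collections import defaultdict


def update_relevance(relevance, query_paths, step=1.0):
    """Raise the relevance of every object pair a query jumps across.

    Assumption: each hop in a query path adds a fixed `step`.
    """
    for path in query_paths:
        for a, b in zip(path, path[1:]):
            if a != b:
                relevance[frozenset((a, b))] += step
    return relevance


def detect_organizations(objects, relevance, threshold):
    """Group objects joined by pairs whose relevance is >= threshold.

    Assumption: an organization is a connected component (union-find).
    """
    parent = {o: o for o in objects}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pair, weight in relevance.items():
        if weight >= threshold:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for o in objects:
        groups[find(o)].append(o)
    return sorted(sorted(g) for g in groups.values())


def cut_fat(org, max_size):
    """Split an organization that exceeds a node's space.

    Assumption: naive fixed-size chunks; a real cut strategy would
    prefer to cut the weakest relations.
    """
    return [org[i:i + max_size] for i in range(0, len(org), max_size)]


def plan_placement(orgs, current, nodes, capacity):
    """Greedily assign each organization to one node.

    Prefers the node that already holds most of the organization's
    objects (fewest moves) among nodes that still have room.
    """
    load = {n: 0 for n in nodes}
    placement = {}
    pieces = [p for org in orgs for p in cut_fat(org, capacity)]
    for piece in sorted(pieces, key=len, reverse=True):
        fits = [n for n in nodes if load[n] + len(piece) <= capacity]
        if not fits:
            raise ValueError("no node has room for this piece")
        best = max(fits, key=lambda n: sum(current.get(o) == n for o in piece))
        for o in piece:
            placement[o] = best
        load[best] += len(piece)
    return placement
```

In this sketch, a caller would feed observed query paths into update_relevance, pass the result to detect_organizations, and compare the output of plan_placement with the current placement to obtain the list of objects to move. The capacity check in plan_placement plays the role of the per-node space limit that the abstract uses to keep load balanced.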
Keywords/Search Tags: dynamic data partition, communication overhead, relation weight, load balance, organization