Components of a scalable distributed relational information service

Posted on: 2006-01-24
Degree: Ph.D
Type: Dissertation
University: Northwestern University
Candidate: Lu, Dong
Full Text: PDF
GTID: 1458390008469196
Subject: Computer Science
Abstract/Summary:
An information service stores information about the resources and services within a distributed computing environment and answers queries about it. This dissertation presents the design of several important components of the Relational Grid Information System (RGIS).

The query rewriting component handles the challenges that originate from the powerful features of relational algebra: some complex queries may be too expensive for RGIS to answer. We have developed several query rewriting techniques that trade query time against the number of results returned. These techniques make it feasible to provide the flexibility and power of a relational data model in a GIS while controlling the costs.

The topology generation and annotation component helps to evaluate the performance of such information servers. GridG is a synthetic grid generator that can generate annotated Internet topologies including routers, IP links, and end systems. GridG is the first synthetic grid generator that follows the power laws of Internet topology while maintaining a clear hierarchical network structure. We also discovered interesting relationships among the power laws.

The scheduling component helps to enhance system performance by minimizing mean response time. We studied the performance of size-based scheduling policies as a function of the correlation between job sizes and their estimates, and explored several estimators. We found that it is feasible to deploy size-based scheduling on RGIS servers.

A centralized RGIS server cannot scale with the distributed computing environment. A distributed RGIS can potentially scale by forming an overlay network and sharing load among the servers. Weak consistency needs to be maintained among the RGIS replicas, with a content delivery network propagating updates. For efficiency and scalability, we designed a novel overlay multicast protocol that emulates the fat-tree architecture on the WAN [1].

The transfer prediction component is responsible for accurately predicting the update transfer time on the overlay network in real time. We developed a novel TCP flow rate monitoring and prediction framework that can predict serial as well as parallel TCP flow rates accurately with low overhead.

Our next step is to integrate these components into RGIS.

[1] This work was done in collaboration with Stefan Birrer and Fabian Bustamante...
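The abstract's point about size-based scheduling and estimate quality can be illustrated with a toy model. The sketch below is not the dissertation's scheduler or its estimators: it assumes a single server, all jobs present at time zero, and non-preemptive shortest-job-first ordering by estimated size. The function names, the Pareto job-size distribution, and the multiplicative noise model are all hypothetical choices made for illustration.

```python
import random


def mean_response_time(sizes, order):
    """Mean completion time when all jobs are present at time 0
    and run back to back in the given order on one server."""
    clock = 0.0
    total = 0.0
    for i in order:
        clock += sizes[i]   # job i finishes at this time
        total += clock
    return total / len(sizes)


def schedule_by_estimate(estimates):
    """Size-based order: job indices sorted by estimated size, smallest first."""
    return sorted(range(len(estimates)), key=lambda i: estimates[i])


def noisy_estimates(sizes, noise, rng):
    """Hypothetical estimator: true size times a random factor in
    [1 - noise, 1 + noise]. noise = 0 gives perfect estimates."""
    return [s * (1.0 + noise * rng.uniform(-1.0, 1.0)) for s in sizes]


def compare(n=1000, noise=0.5, seed=0):
    """Return mean response times for arrival order, perfect estimates,
    and noisy estimates on the same set of heavy-tailed job sizes."""
    rng = random.Random(seed)
    sizes = [rng.paretovariate(1.5) for _ in range(n)]  # assumed workload
    fcfs = mean_response_time(sizes, list(range(n)))
    perfect = mean_response_time(sizes, schedule_by_estimate(sizes))
    noisy = mean_response_time(
        sizes, schedule_by_estimate(noisy_estimates(sizes, noise, rng))
    )
    return fcfs, perfect, noisy


if __name__ == "__main__":
    fcfs, perfect, noisy = compare()
    print(f"arrival order:    {fcfs:.2f}")
    print(f"perfect estimate: {perfect:.2f}")
    print(f"noisy estimate:   {noisy:.2f}")
```

In this simplified setting, ordering by true size gives the lowest possible mean response time, so the gap between the perfect and noisy results shows how much a weaker correlation between estimated and actual size costs. Varying `noise` gives a rough feel for that sensitivity under these assumptions only.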
Keywords/Search Tags: Information, RGIS, Distributed, Component, Relational