
Architectures and algorithms for scalable wide-area information systems

Posted on: 1999-11-02
Degree: Ph.D.
Type: Dissertation
University: The University of Texas at Austin
Candidate: Tewari, Renu
Full Text: PDF
GTID: 1468390014468110
Subject: Information Science
Abstract/Summary:
The focus of this dissertation is the design of an infrastructure for a scalable wide-area information system that manages a large number of heterogeneous objects, maintains low response times, and guarantees high availability. To achieve these goals, we focus on the design and implementation of scalable servers and scalable delivery architectures.

The scalable server design consists of a clustered server that eliminates hot-spots by load balancing, supports the requirements of heterogeneous data types, and stores redundant data to mask disk and node failures. Conventional load-balancing techniques were designed for a disk array in a single-node system and are not suited to multi-node clusters. For load balancing, we propose: (i) data layout techniques for storing data across disks and nodes, and (ii) mechanisms for balancing the load of client connections across the nodes. Next, we design an analytical model of the clustered server that is used to determine the optimal data layout parameters (e.g., stripe unit size, striping width, etc.) for heterogeneous data types. The model also provides insight into the interaction of the interconnection network with the storage disks and its combined effect on the data layout parameters. For masking failures, existing redundant storage techniques neither scale to a large number of users nor support the requirements of heterogeneous data types. We design redundant data placement techniques that have the following features: (i) balance the load during normal and failure mode operations, (ii) minimize the storage and computation overhead, (iii) support optimizations for different data types, and (iv) scale to a large number of users.

Since effective caching techniques are the cornerstone for handling exponential growth and improving performance, the scalable delivery architecture consists of a distributed caching and replication system. We design and implement a distributed cache architecture to achieve the following goals: (i) improve performance, (ii) increase scalability, and (iii) support different data types. To improve performance, each cache in the distributed cache architecture stores location hints that reduce the delay in locating and accessing information. Further, we develop dynamic push caching algorithms that move data closer to clients, thereby improving performance. For scalability, we design a self-configuring meta-data hierarchy that efficiently propagates location hints among an increasingly large number of caches without incurring high space and bandwidth overheads. To support heterogeneous data types, we design a cache replacement policy that manages the cache resources of space and bandwidth among multiple data types. Experimental results show that these techniques improve average client response times by a factor of about 2.5. We provide details of a prototype implementation that incorporates these techniques and is deployed on the Internet. (Abstract shortened by UMI.)
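
To illustrate the data layout parameters named in the abstract (stripe unit size and striping width), here is a minimal Python sketch of plain round-robin striping across the nodes and disks of a cluster. It is a textbook scheme, not the layout technique proposed in the dissertation; the function `locate`, the `Placement` type, and the node-first placement order are assumptions made for this example.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    node: int    # index of the cluster node holding the byte
    disk: int    # index of the disk within that node
    offset: int  # byte offset within that disk's share of the object


def locate(byte_offset: int, stripe_unit: int,
           disks_per_node: int, num_nodes: int) -> Placement:
    """Map a logical byte offset to (node, disk, offset) under round-robin striping.

    Hypothetical layout: consecutive stripe units are spread across nodes
    first, then across the disks of each node, so sequential reads touch
    every node before reusing one.
    """
    if byte_offset < 0 or min(stripe_unit, disks_per_node, num_nodes) <= 0:
        raise ValueError("offset must be >= 0 and all sizes must be positive")

    width = disks_per_node * num_nodes      # striping width, in disks
    unit = byte_offset // stripe_unit       # which stripe unit holds the byte
    row, slot = divmod(unit, width)         # full stripes before it, position in stripe
    node = slot % num_nodes                 # rotate across nodes first
    disk = slot // num_nodes                # then across disks within a node
    offset = row * stripe_unit + byte_offset % stripe_unit
    return Placement(node, disk, offset)


if __name__ == "__main__":
    # 64-byte stripe units over 3 nodes with 2 disks each (width = 6 disks).
    for off in (0, 64, 192, 394):
        print(off, locate(off, stripe_unit=64, disks_per_node=2, num_nodes=3))
```

A larger stripe unit keeps small requests on a single disk, while a wider stripe spreads large objects over more disks and nodes; the analytical model described in the abstract is what selects these values per data type.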
Keywords/Search Tags:Scalable, Information, System, Data types, Techniques, Support, Large number, Architecture