Architectures and algorithms for scalable wide-area information systems

Posted on:1999-11-02

Degree:Ph.D

Type:Dissertation

University:The University of Texas at Austin

Candidate:Tewari, Renu

GTID:1468390014468110

Subject:Information Science

Abstract/Summary:

The focus of this dissertation is the design of an infrastructure for a scalable wide-area information system that supports managing a large number of heterogeneous objects, maintains low response times, and guarantees high availability. To achieve these goals, we focus on the design and implementation of scalable servers and scalable delivery architectures.

The scalable server design consists of a clustered server that eliminates hot spots by load balancing, supports the requirements of heterogeneous data types, and stores redundant data to mask disk and node failures. Conventional load-balancing techniques were designed for a disk array in a single-node system and are not suited to multi-node clusters. For load balancing, we propose: (i) data layout techniques for storing data across disks and nodes, and (ii) mechanisms for balancing the load of client connections across the nodes. Next, we design an analytical model of the clustered server that is used to determine the optimal data layout parameters (e.g., stripe unit size and striping width) for heterogeneous data types. The model also provides insight into the interaction of the interconnection network with the storage disks and its combined effect on the data layout parameters. For masking failures, existing redundant storage techniques neither scale to a large number of users nor support the requirements of heterogeneous data types. We design redundant data placement techniques that: (i) balance the load during normal and failure-mode operation, (ii) minimize the storage and computation overhead, (iii) support optimizations for different data types, and (iv) scale to a large number of users.

Since effective caching techniques are the cornerstone for handling exponential growth and improving performance, the scalable delivery architecture consists of a distributed caching and replication system. We design and implement a distributed cache architecture to achieve the following goals: (i) improve performance, (ii) increase scalability, and (iii) support different data types. To improve performance, each cache in the distributed cache architecture stores location hints to reduce the delay in locating and accessing information. Further, we develop dynamic push caching algorithms that move data closer to clients, thereby improving performance. For scalability, we design a self-configuring meta-data hierarchy that efficiently propagates location hints among an increasingly large number of caches without incurring high space and bandwidth overheads. To support heterogeneous data types, we design a cache replacement policy that manages the cache resources of space and bandwidth among multiple data types. Experimental results show that these techniques yield a speedup of about a factor of 2.5 in average client response times. We provide details of a prototype implementation that incorporates these techniques and is deployed on the Internet. (Abstract shortened by UMI.)
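
The abstract names stripe unit size and striping width as data layout parameters but does not give the layout itself. As a rough illustration only, the sketch below shows a generic round-robin striping scheme across the disks of a multi-node cluster. It is not the dissertation's technique; the `Layout` class, the `locate` function, and the rule that interleaves consecutive disks across nodes are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layout:
    """Hypothetical layout parameters; names are illustrative assumptions."""
    stripe_unit: int     # bytes stored contiguously on one disk
    stripe_width: int    # number of disks an object is striped across
    nodes: int           # number of nodes in the cluster
    disks_per_node: int  # disks attached to each node

    def __post_init__(self) -> None:
        total = self.nodes * self.disks_per_node
        if not 1 <= self.stripe_width <= total:
            raise ValueError("stripe_width must be between 1 and the disk count")
        if self.stripe_unit <= 0:
            raise ValueError("stripe_unit must be positive")


def locate(layout: Layout, offset: int, first_disk: int = 0) -> Tuple[int, int, int]:
    """Map a byte offset in an object to (node, disk_on_node, unit_on_disk).

    Generic round-robin striping (an assumption, not the author's scheme):
    consecutive stripe units go to consecutive global disk numbers, and
    global disk numbers are interleaved across nodes so that neighbouring
    units land on different nodes.
    """
    total_disks = layout.nodes * layout.disks_per_node
    unit = offset // layout.stripe_unit              # index of the stripe unit
    global_disk = (first_disk + unit % layout.stripe_width) % total_disks
    node = global_disk % layout.nodes                # interleave across nodes
    disk_on_node = global_disk // layout.nodes
    unit_on_disk = unit // layout.stripe_width       # how far down that disk
    return node, disk_on_node, unit_on_disk


if __name__ == "__main__":
    layout = Layout(stripe_unit=64 * 1024, stripe_width=4, nodes=2, disks_per_node=2)
    for k in range(6):
        print(k, locate(layout, k * layout.stripe_unit))
```

In this sketch a larger stripe width spreads one object's load over more disks and nodes, while a larger stripe unit means fewer disks are touched per request; how to choose these values for a given data type is the kind of question the abstract says the analytical model addresses.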

Keywords/Search Tags:

Scalable, Information, System, Data types, Techniques, Support, Large number, Architecture

Related items

1	Scalable techniques for network control and evaluation
2	The Study Of Several Key Issues On Large Data Sets Classification Techniques In Pattern Recognition
3	Scalable real-time architectures and hardware support for high-speed QoS packet schedulers
4	Scalable Data Transformations for Low-Latency Large-Scale Data Analysis
5	A scalable information management middleware for large distributed systems
6	Scalable and robust clustering and visualization for large-scale bioinformatics data
7	Support for Scalable Analytics over Databases and Data-Streams
8	The Design Of Mobile Number Portability Business Support System
9	Research And Implementation On The Highly-scalable Distributed Interactive Simulation Support Platform
10	The Management And Support Architecture Research Of Agile Enterprise Information Systems