Processor-embedded distributed storage for high-performance I/O

Posted on: 2005-02-01
Degree: Ph.D
Type: Thesis
University: Northwestern University
Candidate: Chiu, Steve C
Full Text: PDF
GTID: 2458390008491981
Subject: Engineering
Abstract/Summary:
Processor-embedded I/O devices, or smart storage devices, with their on-device memory, storage controller, and network interface, can effectively be viewed as processing elements with attached storage. The growing size and access patterns of today's large I/O-intensive applications require architectures whose processing power scales with their storage capacity. The performance gap between processor, memory, and I/O technologies necessitates novel I/O architectures that provide low-latency and high-bandwidth data delivery, in addition to reliability. This dissertation investigates the performance and reliability of fully distributed and offloaded smart storage architectures.

While user-level code offloading and distributed processing represent the high-level model for the storage architecture, this research also considers the devices employed within the model. Both traditional disk-based and emerging micro-electro-mechanical system (MEMS) based storage devices are incorporated into the smart storage model. Because these storage devices differ fundamentally in data organization and access, it is desirable to evaluate their impact on future high-performance storage architectures. Such impact could result in a shift in the basic characteristics of a workload.

Applying the approach to representative I/O-intensive workloads demonstrates that offloading essential primitives and performing point-to-point data communication improve performance over conventional server-based, centralized architectures. The results indicate that a distributed smart storage system provides the performance and scalability needed to process I/O-intensive tasks efficiently, including basic database operations, commercial decision support system (TPC-H) queries, association rules mining, high-dimensional subspace data clustering, and 2-dimensional fast Fourier transforms on complex matrices.

This dissertation also attempts to address the issue of reliability and availability in a distributed smart storage system. The objective is to provide the required level of fault tolerance and data recovery without sacrificing overall performance. Based on RAID configurations, the proposed schemes select the data distribution most suitable for the access pattern of a target workload, and provide sufficient data redundancy and checkpointing during processing. Together with the performance studies, the fault tolerance designs support the thesis put forth by this work.
Keywords/Search Tags:Storage, Performance, I/O, Distributed, Processing