Font Size: a A A

Architecture conscious information management systems

Posted on:2006-04-16Degree:Ph.DType:Dissertation
University:University of California, Santa BarbaraCandidate:Yu, HailingFull Text:PDF
GTID:1458390008956749Subject:Computer Science
Abstract/Summary:
Due to continuous advances in semiconductor manufacturing, the memory hierarchy has suffered from problems of latency, bandwidth, and cost gaps for decades. The processor-to-memory gap has been partially mitigated by the integration of multi-level caches. However the memory-to-disk gap has been constantly growing. This has resulted in a significant performance bottleneck for data intensive applications.; The goal of this dissertation is to address the I/O bottleneck by exploiting the symbiosis between advanced technologies in hardwares and algorithms capable of maximizing the benefits of the new features provided by these technologies. To this end, we study methods to replace disks with MEMS-based storage devices, which have emerged as the leading candidate for next generation storage devices. Since MEMS-based storage devices have very different characteristics from disks, we exploit their inherent properties when integrating them into existing systems. As a first step, we study the I/O scheduling problem and propose a new scheduling algorithm for MEMS-based storage. Next, we explore MEMS-based storage in the context of relational databases, two-dimensional datasets, and vector sets. We propose the Flexible Retrieval Model (FRM) to place relational data on MEMS-based storage, which enables data retrieval in both row-wise and column-wise manner. We evaluate the performance of FRM using the TPC-H benchmark and show significant I/O improvement. In the context of two-dimensional datasets, we re-evaluate the optimal cost of range queries based on the current disk technology. We establish that it is mpossible to achieve the new optimal cost using disk devices, because the new optimal cost requires two-dimensional sequential access. Therefore we propose to use MEMS-based storage and design a new placement scheme, which almost achieves the new optimal cost.; Considering the I/O bottleneck purely from the perspective of an application, we propose methods for reducing the number of I/O transfers for aggregation ranking queries in OLAP data cubes. These queries aggregate information over a specified range and return the ranked order of the aggregated values. Existing techniques for range aggregate queries are not able to process aggregation ranking queries efficiently due to the large number of I/O transfers. We handle this problem through a new ranking algorithm, which reduces I/O cost dramatically.
Keywords/Search Tags:I/O, Cost, New, Mems-based storage
Related items