
Research On Flash-based Storage Engine Of Relational DBMSs

Posted on: 2012-08-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Xu
GTID: 1118330332475934
Subject: Computer Science and Technology
Abstract/Summary:
The relational database management system (RDBMS) is the most prevalent data management software in the modern world. Over the past 20 years of development, the RDBMS has become one of the most successful products to evolve from computer science theory. Besides giving users a convenient interface for data management, the RDBMS supports transaction semantics, which encapsulate data consistency and durability and significantly reduce the complexity of applications.
Throughout its history, the non-volatile storage of the RDBMS architecture has been the magnetic disk. Recently, the development of the magnetic disk has reached its limits, and flash storage is replacing magnetic devices in more and more computer systems. However, the I/O characteristics of flash storage are quite different from those of magnetic storage. First, as an electronic device with no mechanical disk arms, flash storage provides a much higher random access speed than magnetic devices. Second, because flash chips must erase a relatively large area before overwriting it, their read and write performance is asymmetric: a write operation is much slower than a read.
Deploying a conventional DBMS on flash disks fails to fully exploit their power, because the structures and algorithms of a conventional DBMS are based on the characteristics of the magnetic disk. For example, a dirty page is written directly to permanent storage when it is evicted, and large-scale scans are arranged so that the disk arm moves sequentially. These structures and algorithms are no longer suitable for flash disks. Therefore, a redesign of the RDBMS storage engine is unavoidable.
Given this requirement, this work concentrates on storage engine techniques for flash disks, based on the characteristics of state-of-the-art flash disks and the access logic of RDBMS applications.
The main contributions of our work can be summarized as follows:
· We present a complete framework called CRL, which is designed around the characteristics of flash disks at both the storage level and the transactional level. The key techniques of CRL include a customized data compression algorithm, redo-log based version control, and late materialization. The CRL prototype significantly improves performance on flash disks in conventional OLTP applications.
· We devise an index, named the Update-Migration B+ tree (UM-B+ tree), which is tailored to flash disks. The UM-B+ tree relieves the pressure of the frequent random overwrites incurred by the standard B+ tree and improves I/O performance. We also extend the UM-B+ tree to the transactional environment, which, to our knowledge, has not been done before. We improve the availability of the UM-B+ tree by discussing access control and the recovery mechanism under high concurrency.
· We propose a new framework named Semi-Sharing Scan (S3), which is especially suitable for large-scale concurrent scans over disk-resident data. S3 adopts a novel design that shares reads among scans of similar speeds to save bandwidth. Meanwhile, it compensates the faster scans by executing their random I/O requests separately, in order to reduce the latency of a single scan. Experiments show that S3 outperforms conventional schemes in both bandwidth and CPU utilization.
· We propose a practical hybrid model, in which the storage media of the database consist of both magnetic disks and flash. Data in the database are placed on the suitable medium according to their access patterns. We maintain a mapping table to redirect accesses to pages that reside on flash. The page placement of the system is tuned adaptively according to the physical parameters of the devices and the recent access patterns, which are collected via sliding windows.
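The hybrid model in the last contribution can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the class name `HybridPlacement`, its parameters, and the "hottest pages go to flash" rule are all assumptions made for illustration, and the sketch ignores the device parameters and the read/write distinction that the abstract says the real tuning takes into account. It only shows how a sliding window of recent accesses could drive a mapping table.

```python
from collections import Counter, deque


class HybridPlacement:
    """Hypothetical sketch: sliding-window access counts decide which
    pages are mapped to flash; all other pages stay on magnetic disk."""

    def __init__(self, window_size, flash_capacity):
        self.window_size = window_size        # number of recent accesses kept
        self.flash_capacity = flash_capacity  # number of pages flash can hold
        self.window = deque()                 # recent page ids, oldest first
        self.counts = Counter()               # access count per page in window
        self.flash_map = {}                   # mapping table: page id -> flash slot

    def record_access(self, page_id):
        """Add one access and slide the window forward if it is full."""
        self.window.append(page_id)
        self.counts[page_id] += 1
        if len(self.window) > self.window_size:
            oldest = self.window.popleft()
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]

    def rebalance(self):
        """Rebuild the mapping table from the most accessed pages in the window."""
        hottest = [p for p, _ in self.counts.most_common(self.flash_capacity)]
        self.flash_map = {p: slot for slot, p in enumerate(hottest)}

    def locate(self, page_id):
        """Translate a page id into (medium, address) via the mapping table."""
        if page_id in self.flash_map:
            return ("flash", self.flash_map[page_id])
        return ("magnetic", page_id)
```

Calling `record_access` on every page request and `rebalance` periodically lets the placement follow a shifting workload, since accesses that fall out of the window stop counting toward a page's heat. A fuller version would presumably weigh each page's reads and writes by the measured costs of the two devices rather than by raw access counts.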
Keywords/Search Tags:Relational Database Management System, Flash disks, Storage engine Architecture, Index, Concurrent Scans, Hybrid Storage