Database Logging For Scalable Transaction Processing

Posted on: 2020-04-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Zhou
Full Text: PDF
GTID: 1368330596967753
Subject: Software engineering
Abstract/Summary:
Since the 1970s, relational database management systems have been widely used in fields such as finance, transportation, and communications to organize and manage core data efficiently. To ensure that data is not lost when software or hardware fails, database systems typically rely on a database log. The database log is a sequential file that stores information about transactions and the state of the system at certain instants. Each entry in the log is called a log record and is assigned a unique, monotonically increasing log sequence number (LSN). To ensure reliability and availability, traditional database systems use ARIES transaction logging to write log records into a central log buffer and flush them to disk, and then use log replication to transfer the log records to remote backups. However, with the emergence of multi-core processors and large memory, the centralized design, serial execution, sequential constraint, and I/O operations of traditional database logging have become the main performance bottlenecks of scalable online transaction processing systems. To this end, this dissertation implements new transaction logging schemes and a new log replication scheme to achieve high performance and scalability in database systems. The main contributions are summarized as follows:

1. To address the centralized log buffer contention and fixed group commit of traditional transaction logging, this dissertation presents a scalable and load-adaptive transaction logging scheme (Laser). It uses an LSN calculation based on atomic instructions and parallel log insertion to improve the scalability of database systems. To obtain the lowest transaction commit latency under changing workloads, it proposes a load-adaptive group commit protocol that dynamically determines an optimized group time for each workload. In addition, this dissertation implements Laser in the open-source distributed database system CEDAR and evaluates its performance.

2. To address the limited disk bandwidth of traditional transaction logging, this dissertation proposes a parallel transaction logging scheme on scalable storage devices. It uses multiple log buffers and disks instead of the centralized design. To enable parallel logging, it proposes a global sequence number (GSN) that provides a partial order of log records, together with a persistent group commit protocol. To further reduce the performance overhead caused by log partitioning, it proposes workload-aware log partitioning, which minimizes the number of cross-partition transactions while maintaining load balance. In addition, this dissertation implements an in-memory transaction engine (Plover) with parallel logging, an optimistic concurrency control protocol, and parallel recovery. Experiments demonstrate its parallelism and scalability.

3. To address the sequential constraint of traditional transaction logging, this dissertation proposes a recoverable and partial transaction logging scheme (Poplar). It defines recoverability for transaction logging and demonstrates its correctness for crash recovery. Based on recoverability, it enables log records to persist on multiple disks in parallel, proposes a scalable log sequence number (SSN) to track RAW and WAW dependencies between transactions, and implements a speedy transaction commit protocol. In addition, this dissertation implements Poplar in the open-source in-memory database system DBx1000 and compares it with other transaction logging schemes.

4. To address the limited network bandwidth of traditional log replication, this dissertation proposes an adaptive log replication scheme for hot standby systems. It uses an adaptive shipping method to reduce network traffic and to keep the network from becoming a bottleneck under heavy workloads. In addition, this dissertation implements a high-performance in-memory replication system with the adaptive log replication and parallel transaction logging. It uses a segment-based algorithm and a segment-based replay to ensure consistency between the master and the backups. Experiments verify its effectiveness.

In summary, transaction logging and log replication are essential for guaranteeing the reliability and availability of database systems. However, traditional database logging becomes a major performance bottleneck on multi-core, large-memory platforms. There are four key problems: (1) centralized log buffer contention and a fixed group commit protocol; (2) limited disk bandwidth; (3) the sequential constraint; (4) limited network bandwidth. To remove these bottlenecks, this dissertation carries out four studies and proposes four database logging schemes: a scalable and load-adaptive transaction logging scheme, a parallel transaction logging scheme on scalable storage devices, a recoverable and partial transaction logging scheme, and an adaptive log replication scheme for hot standby systems. The experimental results show that the new database logging schemes enable scalable, high-throughput, and low-latency transaction processing.
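
The LSN calculation based on atomic instructions, named in the first contribution, can be illustrated with a generic sketch. The abstract does not describe how Laser implements it, so the code below is not the dissertation's design. It only shows the common idea of reserving log space with a single atomic fetch-and-add so that writers can copy their records into the buffer in parallel. The names `LogBuffer`, `append`, `reserved`, and `run_writers` are hypothetical, and the LSN is assumed to be the byte offset of a record in the buffer.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <thread>
#include <vector>

// Hypothetical in-memory log buffer (illustration only, not Laser's design).
// Assumption: a record's LSN is its byte offset in the buffer.
class LogBuffer {
public:
    static constexpr std::uint64_t kFull = UINT64_MAX;

    explicit LogBuffer(std::size_t capacity) : data_(capacity), next_lsn_(0) {}

    // Reserve space with one atomic fetch-and-add, then copy the payload into
    // the reserved region without taking a lock. Returns the record's LSN, or
    // kFull when the record does not fit (the failed reservation is not
    // rolled back in this sketch).
    std::uint64_t append(const std::string& payload) {
        const std::uint64_t len = payload.size();
        const std::uint64_t lsn = next_lsn_.fetch_add(len);
        if (lsn + len > data_.size()) return kFull;
        std::memcpy(data_.data() + lsn, payload.data(), len);
        return lsn;
    }

    // Total number of bytes reserved so far.
    std::uint64_t reserved() const { return next_lsn_.load(); }

private:
    std::vector<char> data_;
    std::atomic<std::uint64_t> next_lsn_;
};

// Spawn `writers` threads that each append `per_writer` 8-byte records.
// Returns every LSN that was handed out, sorted in ascending order.
std::vector<std::uint64_t> run_writers(LogBuffer& log, int writers, int per_writer) {
    assert(writers > 0 && writers <= 26);
    std::vector<std::vector<std::uint64_t>> local(writers);
    std::vector<std::thread> pool;
    for (int w = 0; w < writers; ++w) {
        pool.emplace_back([&log, &local, w, per_writer] {
            const std::string record(8, static_cast<char>('a' + w));
            for (int i = 0; i < per_writer; ++i) {
                local[w].push_back(log.append(record));
            }
        });
    }
    for (auto& t : pool) t.join();

    std::vector<std::uint64_t> all;
    for (const auto& v : local) all.insert(all.end(), v.begin(), v.end());
    std::sort(all.begin(), all.end());
    return all;
}
```

Because each writer receives a disjoint byte range, the copies never overlap and the LSNs stay unique and monotonically increasing in reservation order. A real logger would also need to track when every reserved region below a given LSN has been completely copied before flushing it to disk; the sketch leaves that out.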
Keywords/Search Tags:Database Management System, Transaction Processing, Transaction Logging, Log Replication, Parallel Logging