Research on a Highly Available Data Warehouse Based on Failure Recovery

Posted on: 2014-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: Z Peng
Full Text: PDF
GTID: 2248330395480707
Subject: Software engineering
Abstract/Summary:
This thesis reviews the current state of high availability in data warehouses, analyzes the shortcomings of traditional data warehouse failure recovery mechanisms, studies the key technologies for fragment delivery in a data warehouse design based on the three-phase optimized structure and delivery processing, and proposes REVIVAL, a failure recovery algorithm for distributed data warehouses.

To evaluate REVIVAL's feasibility, I compare its runtime overhead and recovery performance with those of two-phase commit and ARIES, the gold standard for log-based recovery, on a four-node distributed database system that I have implemented. My experiments show that REVIVAL incurs lower runtime overhead because it does not require log writes to be forced to disk during transaction commit. They also indicate that REVIVAL's recovery performance is comparable to that of ARIES on many workloads and even surpasses it on characteristic warehouse workloads with few updates to historical data. The results are highly encouraging and suggest that my integrated approach is quite tenable.

Any highly available data warehouse will use some form of data replication to ensure that it can continue to service queries despite machine failures. In this thesis, I demonstrate that it is possible to leverage the data replication available in these environments to build a simple yet efficient crash recovery mechanism that revives a crashed site by querying remote replicas for missing updates. My new integrated approach to recovery and high availability, called REVIVAL, targets updatable data warehouses and offers an attractive alternative to the widely used log-based crash recovery algorithms found in existing database systems. Besides being simpler than log-based approaches, REVIVAL avoids the runtime overhead of maintaining an on-disk log, accomplishes recovery without quiescing the system, allows replicated data to be stored in non-identical formats, and supports the parallel recovery of multiple sites and database objects.
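The idea of reviving a crashed site by querying a remote replica for missing updates can be illustrated with a minimal sketch. This is not the REVIVAL algorithm itself, whose details the abstract does not give. The `Replica` class, its methods, and the `recover` function are hypothetical names. The sketch assumes that every row carries a commit time, that the crashed site holds all updates up to its own latest commit time, and that there are no deletions.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical site storing one table as {key: (value, commit_time)}."""
    rows: dict = field(default_factory=dict)

    def apply(self, key, value, commit_time):
        # Keep only the newest version of each key.
        current = self.rows.get(key)
        if current is None or commit_time > current[1]:
            self.rows[key] = (value, commit_time)

    def updates_since(self, checkpoint_time):
        # Answer a recovery query: every row committed after the checkpoint.
        return {k: v for k, v in self.rows.items() if v[1] > checkpoint_time}

    def last_commit_time(self):
        # Assumption: the site has every update up to this time.
        return max((t for _, t in self.rows.values()), default=0)


def recover(crashed, live):
    """Bring `crashed` up to date by querying `live`; no log is replayed."""
    checkpoint = crashed.last_commit_time()
    missing = live.updates_since(checkpoint)
    for key, (value, commit_time) in missing.items():
        crashed.apply(key, value, commit_time)
    return len(missing)


# Usage: site a misses two updates while it is down, then recovers from b.
a, b = Replica(), Replica()
for site in (a, b):
    site.apply("order-1", 100, commit_time=1)
b.apply("order-1", 120, commit_time=2)
b.apply("order-2", 50, commit_time=3)
copied = recover(a, b)
```

In this sketch the live replica keeps answering the recovery query like any other read, which loosely mirrors the abstract's point that recovery need not quiesce the system.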
Keywords/Search Tags: Failure Recovery, High Availability, Data Warehouse