Font Size: a A A

Research On Active Database Theory

Posted on:2006-07-11Degree:DoctorType:Dissertation
Country:ChinaCandidate:W L ZuoFull Text:PDF
GTID:1118360182456867Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Active database management system (ADBMS) is a natural extension and enhancement of traditional passive database, with a salient feature of being reactive in response to event inside or outside of the DBMS system. The reactive nature of ADBMS is achieved by the definition and processing of active rules, which is also called ECA (event-condition-action) rules. When event occurs and condition holds, action will be executed automatically by the ADBMS system. Active DBMS can automatically and effectively maintain data integrity and consistency constrains, satisfying advanced and real-time database applications. The idea of active database can be traced back as early as the later 70's, in QBE, CODASYL, and System R. The sophisticated pioneer work in active database is HiPAC of the later 80's, in which the term active database is nailed. There has been a surge of research in active database area in the early and later 90's, with substantial research results published and more than 10 prototype active database systems developed. The most influential institutions contributed to the area include The CCA Company (HiPAC project), The University of Florida (Sentinel), University of Zurich (SAMOS), IBM Almaden Research Center (Satrburst), etc. Domestic institutions devoted to active database area include Wuhan University, Huazhong Poly-technical University, Fudan University, Zhejiang Yniversity, Shandong University, Harbin Poly-Technical University and Jilin University. Although quite a few active database prototype systems have been developed, and some simplified (restricted) active rules have been integrated into commercial DBMS in the form of triggers or assertions, it is too early to say that active DAMS is mature. This not only lead to remarkable divergence in active database semantics, but also restricted active database applications. As such, top-level international database conferences, such as VLDB2006, still solicit active database contributions as one of the main topics. The ultimate goal of this dissertation is to clarify problems inadequately or incompletely solved in the area, conduct an in-depth study in related topics towards the goal of establishing a complete and sound theoretical foundation of active database theory. The work reported in this dissertation is based on Petri net. Following a comprehensive survey of the state-of-the art in ADBMS, theoretical problems inadequately or incompletely solved in the area are identified one by one, which determined directions of our research efforts. Instead of building rule and prototype systems to start, the ongoing study is mainly devoted to theoretical problems, with emphasis on temporal event-rule language extension, incremental composite event detection, termination analysis, formal semantics of active rule execution supporting multiple coupling modes, rule scheduling and concurrency control of active rule execution in the framework of database transactions, deadlock detection and recovery algorithms. The main contributions and research results are of the following: (1) A comprehensive survey in active database research is reported, which covers active database history, development process, terminologies and techniques, as well as current research status. Our emphasis is placed on theoretical problems remain to be resolved, which constitute our further research background and foundation. (2) The syntax and semantics of event language is investigated. Three novel unary event operators, pre, at, post are proposed, which can be arbitrarily nested with other event operators such as sequential, or, and, not, etc to express any complex temporal relations. Based on this, temporal event algebra and temporal event language are defined, along with time semantics for any event expression, which extends traditional event language to temporal space, realizing the Stella Gatziu's hypothesis of regarding event as a specific point in time. Some good properties of the extended temporal event algebra are also proved. (3) Petri net is studied for adoption in temporal event detection. In accordance with event consumption semantics, token replace Petri net (Token Replace Petri net, TRPN) is defined. The extended TR Petri Net can note only detect the occurrence of composite event, but also forward the time when the component event occurs. This not only provides a quantitative basis for rule scheduling, but also initiates a new way for confluent rule execution. Because of the existence of auxiliary place in TR Petri Net, the algorithm is incremental in nature, efficient in implementation. (4) In the area of rule termination, we analyzed existing algorithm based on triggering and activation graphs, and draw conclusion that current algorithms are conservative. We first identified an incorrect result in static rule termination analysis based on triggering graph (TG) and activation graph (AG) in the presence of rule priorities. By introducingtriggering reachable (instead of reachable of TE and AE), we corrected the current analysis result. Furthermore, based on the observation that the execution of one rule may definitely falsify another rule's condition, a new type of graph, named deactivation graph (DG), is introduced. By integrating TG, AG and DG, a more generalized graph, named Relationship graph (RG) is defined. The most accurate termination analysis algorithm based on RG is thus proposed. Some non-termination rule set estimated by TG and AG can be judged as termination, avoiding false alarms. (5) In rule execution semantics aspect, we analyzed the incompleteness of current research results in the area: existing confluence algorithms only support deferred (not immediate as some paper claim) coupling mode, whereas execution models supporting multiple coupling modes (immediate, deferred, detached, and their combination) don't guarantee confluent execution result. In order to handle various coupling modes properly, we first introduced the concepts of transitive dependency (indirect dependency) and totally commute. It is first identified that whenever rule triggering involves rules with immediate or/and detached coupling mode, not only direct, but also indirect dependencies need to be declared, and thus relative priorities between indirectly conflicting rule pairs need to be specified. Based on the above, this thesis first established a formal correct rule execution semantics in the form of sequential rule processing, and proved that any correct rule execution leads to the same final database state (confluent). (6) In rule execution model, we put forward the notion of hierarchical serializability (nested serializability), and proved that nested serializability is equivalent to correct rule execution. Under the framework of extended nested database transactions, a confluent rule scheduling and concurrency control algorithm supporting multiple coupling modes with maximal parallelism is established, along with efficient deadlock detection and recovery algorithms based on transaction tree and transaction forest. (7) A framework of active database system Petri-AOODBMS is proposed. The system integrates with an open OODBMS system. Active components and their interaction with underlying database system are explained. Most theoretical algorithms proposed in this dissertation are tested on the prototype system. It is expected that the present work consummate active database theory, constituting a significant step forward towards a complete and sound active database theory for further enrichment in commercial database management systems.
Keywords/Search Tags:Active database, trigger, ECA rule, event, condition, action, coupling mode, immediate coupling, deferred coupling, detached coupling, temporal event, temporal operator, temporal event algebra, event consumption semantics, time semantics
PDF Full Text Request
Related items