Real View Of Re-calculation Algorithm And Implementation

Posted on: 2003-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zuo
GTID: 2208360062980738
Subject: Computer application technology
Abstract/Summary:
Data warehouses (DW) are built by gathering information from distributed information sources (ISs) and integrating it into one customized repository. Data warehousing supports high-level queries and analysis (e.g., decision support, data mining). A materialized view of the system is kept at a site called the data warehouse, and users' queries are processed against this view. Once a DW is established, keeping it consistent with the underlying ISs as they are updated becomes a critical issue. Currently, there are two main approaches: incremental materialized view maintenance and view self-maintenance. The latter maintains the materialized view at the DW without access to the base relations by replicating all or part of the base data at the DW. However, the materialized view must still obtain some necessary information from the ISs. As more and more data is added to the DW, space complexity grows and information redundancy arises, which may lead to an inconsistent DW extent. In addition, not all views are self-maintainable. Therefore, the practical approach is to maintain the materialized view incrementally. Incremental materialized view maintenance materializes the relevant subset of the base relations as the materialized (real) view and answers user queries from this local data. After updates are received from the ISs, the materialized views are re-evaluated incrementally using an algorithm such as ECA, Strobe, or OLEC. OLEC overcomes the disadvantages of ECA and Strobe: it requires no additional local compensation, and the DW need not remain static when an incremental view update commits.
Our goal is to construct a general DW engine into which existing MIS and distributed database systems can be embedded, and to develop it into a DW system for building high-level data applications at the lowest cost. In this thesis, we propose a new algorithm, PMDVM, for maintaining views and their consistency at the DW.
This algorithm can be implemented with CORBA and Java. We extend OLEC with parallel view maintenance under concurrent data updates, improve on the WHIPS model, and incorporate several algorithms, such as relevance detection and view self-maintenance. When an update is received, PMDVM first tests its relevance. If the update is irrelevant, it is discarded immediately. If it is relevant, PMDVM checks whether it can be handled by self-maintenance or requires concurrent incremental maintenance. If the update satisfies self-maintenance, our mechanism queries auxiliary views (instead of querying the ISs) to avoid network transmission and keep the materialized view consistent with the underlying ISs. An update that requires concurrent, incremental view maintenance is handled by POLEC.
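The update-handling steps described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the class and method names are hypothetical, relevance is approximated by checking whether the updated relation appears in the view definition, and an update is treated as self-maintainable when every other relation of the view is replicated in auxiliary views at the DW. The actual PMDVM tests are not given in the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Route(Enum):
    DISCARDED = auto()        # irrelevant update, dropped immediately
    SELF_MAINTAINED = auto()  # answered from auxiliary views at the DW
    INCREMENTAL = auto()      # handed to the concurrent incremental path (POLEC)


@dataclass(frozen=True)
class Update:
    relation: str  # name of the base relation that changed at some IS
    row: tuple     # the inserted or deleted tuple


@dataclass
class ViewMaintainer:
    """Hypothetical dispatcher; both tests below are simplifying assumptions."""
    view_relations: set       # base relations the view is defined over
    auxiliary_relations: set  # base relations replicated at the DW
    log: list = field(default_factory=list)

    def is_relevant(self, update: Update) -> bool:
        # Assumption: an update matters only if its relation is in the view.
        return update.relation in self.view_relations

    def is_self_maintainable(self, update: Update) -> bool:
        # Assumption: the view change can be computed locally when all the
        # other relations of the view are available as auxiliary views.
        others = self.view_relations - {update.relation}
        return others <= self.auxiliary_relations

    def handle(self, update: Update) -> Route:
        if not self.is_relevant(update):
            route = Route.DISCARDED
        elif self.is_self_maintainable(update):
            route = Route.SELF_MAINTAINED
        else:
            route = Route.INCREMENTAL
        self.log.append((update, route))
        return route


maintainer = ViewMaintainer(
    view_relations={"orders", "customers"},
    auxiliary_relations={"customers"},
)
routes = [
    maintainer.handle(Update("suppliers", (7, "Acme"))),  # not in the view
    maintainer.handle(Update("orders", (1, 42))),         # needs only customers
    maintainer.handle(Update("customers", (42, "Li"))),   # needs orders from an IS
]
```

In this sketch only the third update would require contacting an information source, which reflects the motivation stated above for checking self-maintainability before falling back to incremental maintenance.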
Keywords/Search Tags: data warehouse (DW), materialized view, re-evaluation, PMDVM