
An intelligent cache manager in data warehousing environment and its application to the Web caching

Posted on: 1999-12-13
Degree: Ph.D.
Type: Thesis
University: Northwestern University
Candidate: Shim, Junho
GTID: 2468390014967938
Subject: Computer Science
Abstract/Summary:
A data warehouse is a stand-alone repository of information consisting of “interesting” and “historic” data from several heterogeneous operational databases. A data warehouse is very large and grows over time. Data warehouses are usually dedicated to processing queries issued by decision support systems (DSS). The response time of DSS queries is typically several orders of magnitude higher than that of OLTP (OnLine Transaction Processing) queries. Since DSS queries are often submitted interactively, techniques for reducing their response time are important.

The caching of query results is one such technique, and it is particularly well suited to the DSS environment. In this thesis, we present an intelligent cache manager for such an environment. The cache manager can look up queries either by exact query match or by using a query split algorithm that efficiently finds cached query results which subsume the submitted query. The cache manager dynamically maintains the cache content by deciding whether a new query result should be admitted to the cache and, if so, which query results should be evicted. These decisions aim to minimize query response time and are based on the execution cost of each query, the size of each query result, the reference frequency of each result, the cost of maintaining each result when the base tables are updated, and the frequency of those updates. Experimental evaluation shows that the manager can significantly improve performance when compared to similar systems.

Since Web documents vary in size, and the cost of materializing them depends on network delays, a profit-based cache replacement algorithm can also be applied to Web caching. At the same time, the cache must guarantee some form of consistency for the cached documents. Cache consistency algorithms enforce appropriate guarantees about the staleness of the cached documents. We have developed a unified cache maintenance algorithm which integrates both cache replacement and consistency algorithms. A trace-driven experimental study shows that the unified algorithm not only improves the average response time but also significantly reduces the number of stale documents returned to the clients.
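As a rough illustration of the admission and eviction decisions described in the abstract, the following Python sketch scores each cached result with a profit value built from the five factors the abstract names. The abstract does not state the formula, so everything here is an assumption made for illustration: the profit expression (saved execution cost minus maintenance cost, per unit of cache space), the admission test against a size-weighted average of the victims' profit, and the names `CachedResult` and `ProfitCache`. Only exact-match lookup is shown; the query split algorithm is not sketched.

```python
from dataclasses import dataclass


@dataclass
class CachedResult:
    query: str
    size: float               # space occupied by the query result
    exec_cost: float          # cost of re-executing the query
    ref_rate: float           # estimated references per unit time
    maint_cost: float = 0.0   # cost of refreshing the result after a base-table update
    update_rate: float = 0.0  # estimated base-table updates per unit time

    def profit(self) -> float:
        # Assumed metric, not the thesis formula: net cost saved per unit of space.
        saved = self.ref_rate * self.exec_cost
        upkeep = self.update_rate * self.maint_cost
        return (saved - upkeep) / self.size


class ProfitCache:
    def __init__(self, capacity: float):
        self.capacity = capacity
        self.entries: dict[str, CachedResult] = {}

    def used(self) -> float:
        return sum(e.size for e in self.entries.values())

    def lookup(self, query: str):
        # Exact query match only.
        return self.entries.get(query)

    def offer(self, new: CachedResult) -> bool:
        """Try to admit `new` (assumed not already cached); return True if admitted."""
        if new.size > self.capacity:
            return False
        free = self.capacity - self.used()
        victims = []
        # Pick the least profitable results until enough space would be freed.
        for entry in sorted(self.entries.values(), key=CachedResult.profit):
            if free >= new.size:
                break
            victims.append(entry)
            free += entry.size
        if victims:
            # Admit only if the newcomer beats the size-weighted profit of its victims.
            total_size = sum(v.size for v in victims)
            victim_profit = sum(v.profit() * v.size for v in victims) / total_size
            if new.profit() <= victim_profit:
                return False
        for v in victims:
            del self.entries[v.query]
        self.entries[new.query] = new
        return True
```

Under these assumptions, a small, expensive, frequently referenced result displaces a cheap one, while a result whose upkeep exceeds its savings is rejected instead of evicting more profitable entries. A Web cache could reuse the same shape by treating network fetch delay as the execution cost, although the unified replacement and consistency algorithm mentioned in the abstract is not reproduced here.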
Keywords/Search Tags:Cache, Data, Response time, Query, Web, Environment, Documents, DSS