
The Research On Materialized View Selection And Maintenance In Data Warehouse

Posted on: 2005-04-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L J Zhou
Full Text: PDF
GTID: 1118360125470655
Subject: Computer application technology
Abstract/Summary:
Data warehouse technology emerged from enterprises' growing need for decision-support information and from the rapid development of computer technology. Architecture design is one of the core research problems in the study and evolution of data warehouses. A data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous, distributed databases. The information stored at the data warehouse is in the form of views, referred to as materialized views. Pre-storing these views shortens query response time, so studying and using materialized views improves the performance of the data warehouse. The dissertation focuses on the following aspects:

(1) The selection of materialized views is one of the most important decisions in designing a data warehouse. Materialized views are stored in the data warehouse in order to answer on-line analytical processing (OLAP) queries efficiently. The first issue for the user to consider is query response time, so the query cost view selection problem is proposed. To solve it, the view selection cost graph and its construction method are put forward, together with the cost model of the query cost view selection problem and the process of materialized view selection.

(2) Based on the above cost model, a method and strategy for selecting materialized views dynamically with a greedy algorithm is proposed. In the original greedy algorithm, the number of materialized views, k, is fixed manually. An assumed value of k may not give a satisfactory result and can reduce the efficiency of OLAP. In the dissertation, the value of k is obtained dynamically by minimizing maintenance cost under a given query cost.

(3) Methods for selecting materialized views with randomized algorithms are presented.
First, the genetic algorithm is applied to the materialized view selection problem (GA_VSP). A solution is represented by converting it into a binary string according to the given view selection cost graph; the genetic operations are presented and the fitness function is defined. As the genetic process advances, legal solutions become increasingly difficult to produce, so many solutions are eliminated and the time needed to produce solutions grows, which makes it harder for the GA_VSP algorithm to find a solution. Therefore, an improved algorithm (SAGA_VSP), which combines the simulated annealing algorithm with the genetic algorithm, is presented for solving the query cost view selection problem. In the improved algorithm, candidate materialized view selections are produced by the genetic rules and are then accepted or rejected by the simulated annealing algorithm. As a result, the solution space is expanded further, the diversity of solutions is preserved, and producing solutions becomes less difficult, so near-optimal solutions can be found more easily.

Simulation experiments are used to test the function and efficiency of the materialized view selection algorithms. The experiments show that the given methods provide near-optimal solutions in limited time. For the query cost view selection problem, the results also show that the GA_VSP algorithm is better than the greedy algorithm, and that the improved algorithm (SAGA_VSP) is better than the GA_VSP algorithm. Randomized algorithms will become invaluable tools for data warehouse evolution.

(4) There are two methods for maintaining materialized views in a data warehouse: recomputing the views and incremental maintenance. Incremental maintenance techniques are adopted in the dissertation. For the same view, different methods produce different amounts of incremental data and therefore incur different maintenance costs.
The idea and strategy of minimizing incremental maintenance are presented. The materialized view definitions and maintenance expressions, as well as the algorithm, are given. The mainten...
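
The dynamic choice of k described in item (2) can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm: the cost model (each query is answered by the smallest selected view that can answer it), the benefit-per-maintenance-cost ranking, the stopping rule, and all names and numbers (`greedy_select`, `size`, `maint`, `answers`, the four-view toy lattice) are assumptions made for illustration. The point it shows is that views are added greedily until the total query cost falls within a given limit, so the number of views is an outcome of the run and not an input.

```python
from typing import Dict, List, Set, Tuple


def total_query_cost(selected: Set[str], size: Dict[str, int],
                     answers: Dict[str, Set[str]], queries: List[str]) -> int:
    """Sum, over all queries, the size of the smallest selected view that can answer it."""
    return sum(min(size[v] for v in selected if q in answers[v]) for q in queries)


def greedy_select(size: Dict[str, int], maint: Dict[str, float],
                  answers: Dict[str, Set[str]], queries: List[str],
                  base: str, query_cost_limit: int) -> Tuple[Set[str], int]:
    """Add views greedily until the query cost limit is met (k is not fixed in advance)."""
    selected = {base}  # the base view is always available
    cost = total_query_cost(selected, size, answers, queries)
    while cost > query_cost_limit:
        best, best_ratio, best_cost = None, 0.0, cost
        for v in size:
            if v in selected:
                continue
            new_cost = total_query_cost(selected | {v}, size, answers, queries)
            # Assumed ranking: query cost saved per unit of maintenance cost.
            ratio = (cost - new_cost) / maint[v]
            if ratio > best_ratio:
                best, best_ratio, best_cost = v, ratio, new_cost
        if best is None:  # no remaining view lowers the query cost
            break
        selected.add(best)
        cost = best_cost
    return selected, cost


# Hypothetical toy lattice over two dimensions A and B (all numbers invented).
size = {"AB": 100, "A": 20, "B": 50, "none": 1}            # rows scanned per view
maint = {"AB": 0.0, "A": 20.0, "B": 50.0, "none": 1.0}     # maintenance cost per view
answers = {"AB": {"AB", "A", "B", "none"}, "A": {"A", "none"},
           "B": {"B", "none"}, "none": {"none"}}           # queries each view can answer
queries = ["AB", "A", "B", "none"]

selected, cost = greedy_select(size, maint, answers, queries, "AB", 250)
print(sorted(selected), cost)  # ['A', 'AB', 'none'] 221
```

With a query cost limit of 250, this toy run adds two views to the base view; a tighter limit would add more, which is the sense in which k is determined by the cost constraint.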
Keywords/Search Tags: data warehouse, On-Line Analytical Processing, algorithm, materialized view selection, materialized view maintenance