
Study On Pre-Aggregated Consistency Maintenance Strategy In Distributed Data Warehouse

Posted on: 2010-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: K Liu
GTID: 2178360272485304
Subject: Computer application technology
Abstract/Summary:
As the range of data warehouse applications expands, the centralized data warehouse environment no longer satisfies users' requirements, and distributed data warehouse technology has emerged to meet them. In a data warehouse, most of the data comes from underlying operational databases that are distributed, heterogeneous, and autonomous, and the information in the warehouse is stored in the form of materialized views. Pre-aggregated data is an important part of these materialized views: its existence speeds up query response and greatly improves data warehouse performance. However, the data in the underlying databases changes continuously, so the pre-aggregated data in a distributed data warehouse must be maintained in a timely manner in order to reflect the changes in the data sources.

After introducing distributed data warehouses and pre-aggregated data, this thesis analyzes the deficiencies of on-line and off-line maintenance, extends the application range of the dynamic incremental maintenance algorithm, and proposes a model and algorithm for dynamic incremental maintenance in a distributed data warehouse. By setting up two levels of views on each distributed node, the method for incremental view maintenance is shown to be more effective.

The data cube, one of the organizational forms of pre-aggregated data, is a multi-dimensional extension of the two-dimensional table. Aggregating data amounts to materializing a data cube, so the maintenance of aggregated data can be transformed into the maintenance of the data cube. The consistency maintenance of pre-aggregated data therefore depends on how the data cube is organized.
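The abstract does not spell out the dynamic incremental maintenance algorithm itself, so the following is only a generic sketch of the underlying idea: a pre-aggregated view is updated from the changes (inserted and deleted source rows) instead of being recomputed from scratch. The class name `AggregateView`, the region keys, and the sample rows are assumptions made for illustration and do not come from the thesis.

```python
from collections import defaultdict


class AggregateView:
    """Hypothetical pre-aggregated view keeping SUM and COUNT per group key."""

    def __init__(self):
        # group key -> [running sum, running row count]
        self.totals = defaultdict(lambda: [0, 0])

    def apply_delta(self, inserted=(), deleted=()):
        """Fold a batch of source changes into the view without a full recompute.

        Deleted rows are assumed to refer to rows that were inserted earlier.
        """
        for key, value in inserted:
            entry = self.totals[key]
            entry[0] += value
            entry[1] += 1
        for key, value in deleted:
            entry = self.totals[key]
            entry[0] -= value
            entry[1] -= 1
            if entry[1] == 0:
                # No source rows remain for this group, so drop it from the view.
                del self.totals[key]

    def as_dict(self):
        return {key: tuple(entry) for key, entry in self.totals.items()}


def recompute(rows):
    """Reference result: rebuild the view from the full set of source rows."""
    fresh = AggregateView()
    fresh.apply_delta(inserted=rows)
    return fresh.as_dict()


# Initial load of the source rows (region, amount).
view = AggregateView()
view.apply_delta(inserted=[("east", 10), ("east", 5), ("west", 7)])

# Later the source changes: one row is added and one is removed.
view.apply_delta(inserted=[("west", 3)], deleted=[("east", 5)])
```

SUM and COUNT are used here because they can be maintained from deltas alone; aggregates such as MIN or MAX may need access to the source rows when a deletion removes the current extreme value.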
Building on the multi-dimensional structure and the data cube, dimension hierarchy information is introduced into the materialization process. The cube is stored partitioned by dimension and, at the same time, divided into sections that are then clustered by theme. This forms a data cube storage structure suited to distributed systems: the distributed data cube. With this structure, the data cube is distributed by theme across the nodes of the distributed data warehouse. The efficiency of data maintenance is improved by combining the incremental maintenance algorithm with distributed parallel processing.

To test, in a distributed environment, the function and efficiency of the algorithm used in centralized systems, a distributed simulation system was built in which the aggregated data is stored in the form of a distributed data cube. The test results show that the algorithm performs better in the consistency maintenance of pre-aggregated data.
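As a rough illustration of the distributed data cube described above, the sketch below places cube cells on nodes by theme and applies incremental changes to the nodes in parallel. It is a minimal sketch under stated assumptions, not the thesis's implementation: the names `CubeNode` and `maintain`, the themes `sales` and `inventory`, and the use of threads to stand in for separate warehouse nodes are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class CubeNode:
    """Hypothetical warehouse node holding the cube cells of one theme."""

    def __init__(self, theme):
        self.theme = theme
        # dimension coordinates, e.g. (quarter, region) -> aggregated measure
        self.cells = defaultdict(int)

    def apply_deltas(self, deltas):
        """Incrementally update the affected cells; return the number of changes applied."""
        for coords, change in deltas:
            self.cells[coords] += change
        return len(deltas)


def maintain(nodes, changes):
    """Route each change to the node that owns its theme, then update nodes in parallel."""
    per_theme = defaultdict(list)
    for theme, coords, change in changes:
        per_theme[theme].append((coords, change))

    # Each node receives exactly one task, so no two threads touch the same cells.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = {
            theme: pool.submit(nodes[theme].apply_deltas, deltas)
            for theme, deltas in per_theme.items()
        }
        return {theme: future.result() for theme, future in futures.items()}


# Assumed example data: two themes, each stored on its own node.
nodes = {theme: CubeNode(theme) for theme in ("sales", "inventory")}
changes = [
    ("sales", ("2010-Q1", "east"), 120),
    ("sales", ("2010-Q1", "east"), -20),
    ("inventory", ("2010-Q1", "widget"), 40),
]
applied = maintain(nodes, changes)
```

Grouping the changes by theme before dispatch means each node only sees the deltas for the cells it stores, which is what lets the updates proceed independently of one another.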
Keywords/Search Tags: distributed data warehouse, pre-aggregated data, dynamic incremental maintenance, distributed data cube