Font Size: a A A

The Research Of Query Optimization Based On Bitmap Indices For Data Warehouse

Posted on:2011-07-13Degree:MasterType:Thesis
Country:ChinaCandidate:X ChengFull Text:PDF
GTID:2178330332458100Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Index is an important query optimization technique for data warehouse, which including tree indexes and bitmap indexes. Bitmap index is widely used in the data warehouse because of its simple structure and its high efficiency of logic operation. Bitmap index has proven to be very efficient with low attribute cardinalities(the number of distinct values). However, for high-cardinality attributes, bitmap index requires too much storage. Therefore, bitmap index is often think to be unsuitable for high-cardinality attributes.Researchers have provided a lot of strategies to solve this problem, including coding, compression, bin. Bitmap index with bin can effectively reduce the storage required on high-cardinality attributes. This technique built a bitmap for each bin(attribute range) rather than distinct values. But it also introduces another problem,which is called candidate check.Candidate check usually dominates the total query processing time.Using traditional multi-dimensional query algorithm,query the various attributes of a different order, will be significantly affected the total number of candidate check. This paper presents two theorems,which prove that two factors affect the sorting. And accordingly, we presents a dynamic sorting algorithm to improve query performance. The algorithm through sorting the attributes to reduce the number of candidate check. Theoretical analysis and experimental results show that this sorting algorithm can significantly reduce the total number of candidate check, and optimized the traditional multi-dimensional query algorithm.But the dynamic sorting algorithm can not reduce the number of candidate check needed of the first dimension.Experiments show that it always over the half of total number of candidate check.The number of candidate check can be effectively reduced by the pre-scan (delayed candidate check).But the pre-scanning requires additional cost,it requires more operations on bitmaps.And the cost can not be ignored. Taking into account after a certain dimensions, to still pre-scan will not significantly reduce the total number of candidate check. The paper propose a dynamic pre-scan algorithm to determine what number of attributes and which attributes should be pre-scan dynamically to improve query efficiency. Theoretical analysis and experimental results show that dynamic pre-scan algorithms worked well.
Keywords/Search Tags:Data Warehouse, Bitmap index, Binning, Multi-Dimensional Queries, Encoding, Compression
PDF Full Text Request
Related items