Font Size: a A A

Bitmap Index As Effective Indexing For Low Cardinality Columns In Data Warehouse

Posted on:2014-06-26Degree:MasterType:Thesis
Country:ChinaCandidate:N Z a i n a b Q a y s A b Full Text:PDF
GTID:2298330431499360Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Speeding up query processing is a sensitive issue in the data warehouse and Business Intelligence (BI) solutions. Using some mechanisms like indexes can achieve this goal with efficient performance without extra hardware. The challenge is to find a suitable indexing method type that would enhance the query’s performance.Low cardinality data appears in data warehouse widely. Cardinality is the number of values in columns in a dataset. The column which has a repetition in value defined as a low cardinality column. One example of this is a gender column which has just two values either male or female.B-tree index sort the data storage and allow toO(logn) running time to find a tree data structure, it considered as poor choice for indexing low cardinality columns. Some relational database management systems adopt new indexing techniques, such us, bitmap indexing, to speed up inquiry processing. Because the particular structure of bitmap index rustle in quick data retrieval. The structure of bitmap indexes mainly uses a string of binary numbers (0or1) to indicate whether the indexed column in a table is equal to a specific value that is, bitmap index set to ’1’ if the specific value exists in the indexed column otherwise’0’.The aim of this thesis is proving that the bitmap index more efficient than B-tree index in data warehouse especially when we have low cardinality columns, using Oracle environment which uses B-tree as default indexing.We tested the two indexing techniques as B-tree and bitmap index on the basis of the time taken for execution by making three experiments. We tested SQL query for three conditions, two conditions, and one condition for four groups of records300000,600000,800000, and1000000respectively. The results of experiment show that bitmap index take less execution time than B-tree index particularly when we have low cardinality columns. The time and space complexity rules have been applied for bitmap index and B-tree index, Mathematical calculation demonstrates that the time and space complexity for building bitmap index are both smaller than B-trees index especially when we have low cardinality columns.Since bitmap index is considered as a promising index for large data warehouse in this time and in the future, so we look forward for more efficiency in the performance of bitmap index during the execution of SQL enquiry through proposed new approach combining basic bitmap index algorithm and range based encoding algorithm with B-tree index depending on parameters h and l, which calculated automatically, they have relationship with the number of tuples. In general, h can be1%, and l can be0.1%. In range based encoding algorithm which is a modification of the basic bitmap index, bitmap vector is used to represent a range of value rather than a separate value. The main advantages of this technique it can achieve less time complexity than basic bitmap index particularly when we deal with a range of values. Simple bitmap indexes are only efficient for attributes with a low number of distinct values, but for high cardinality attributes the time complexity is considerably high.In our proposed approach, we combine range based encoding algorithm with basic bitmap index algorithm with B-tree index to handle this weakens and at the same time to increase the efficiency of bitmap index. An advantage of our proposed approach is that it can deal with both low cardinality data and high cardinality data more effectively. With the proposed approach, the time complexity will be less than using range based encoding algorithm for low cardinality data in case Ct≤l(when the cardinality less than or equal l where l is parameter has relationship with number of tuples in general l can be0.1%.), and it will be less than using basic bitmap index in case Ct> when the cardinality greater than l. At worst case the time complexity of our proposed approach will be the same as B-tree index if all the data is high cardinality, and it will be the same as basic bitmap index if Ct≤l, and the same of rang based bitmap index if Ct>l in case of low cardinality.
Keywords/Search Tags:Data warehouse, Bitmap index, B-tree index, Range encoding index, Cardinality, indexing
PDF Full Text Request
Related items