
Techniques Research For Data Cube Compression

Posted on: 2011-01-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Q Chen
GTID: 1118360302973754
Subject: Computer application technology
Abstract/Summary:
With the rapid development of business intelligence and decision support technologies, data warehouse applications are spreading into more and more fields. A data cube is a kind of materialized view in a data warehouse. It helps the system shorten query response time, but its huge volume causes many problems, including high storage cost and low maintenance efficiency. How to compress data cubes has therefore become an important research topic in recent years.

In this dissertation, we study several data cube compression techniques, mainly the cover quotient cube, closed cube, QC-table and iceberg quotient cube, and we propose new methods to improve the efficiency of current algorithms. The dissertation focuses mainly on storage, indexing and query answering in the field of data cube compression. We study the following techniques.

(1) We study a number of issues related to the cover quotient cube, which has a file-size advantage among current data cube compression techniques. To build a sound theoretical base, we analyze some important properties of the cover quotient cube and propose some essential concepts. Based on this analysis and these concepts, we put forward two new methods to generate the cover quotient cube. Algorithms to answer point queries, range queries and drill-down/roll-up queries are provided. We put forward a new bitmap index technique named qcbit index, which uses a bitmap file as the index file, so the file size is very small. The qcbit index helps the system locate the targeted upper bound quickly when answering queries, and the index itself represents the attribute values. Using the qcbit index, the OLAP system can answer queries in much shorter time with far fewer disk I/O operations. The dissertation also proposes new methods to compress value-list index files, amending the run-length approach to obtain a higher compression ratio. Approaches to storing cover quotient cubes in parallel environments and on a single node are also studied, and strategies for choosing the hardware structure and for declustering are provided.

(2) This dissertation studies some other data cube compression techniques, including the closed cube, iceberg quotient cube and QC-table. For closed cubes, we propose the concept of a closed mask. Based on this concept, we partition the closed cube into subsets and visit only some of them, which decreases disk I/O operations and so answers queries in less time. For iceberg quotient cubes, we mainly study approaches to answering queries: we propose some concepts, analyze the properties of this kind of cube, and give both special query answering algorithms for certain kinds of iceberg quotient cubes and general query algorithms. For the QC-table, we point out the significance of studying it, then propose approaches to compress it further and algorithms to answer queries.

(3) The dissertation proposes the technique of the condensed quotient cube, which combines the advantages of the cover quotient cube and the condensed cube. A condensed quotient cube is a subset of the original cover quotient cube. It has a smaller file size and thus answers queries more efficiently. The corresponding data structure and query algorithms are also put forward.

Finally, we summarize the research in this dissertation and point out directions for future study.
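
As a rough illustration of the idea behind the cover quotient cube, the sketch below groups the cells of a tiny cube into classes of cells that cover exactly the same base tuples, and picks the most specific cell of each class as its upper bound. It is a brute-force illustration of cover equivalence only, not one of the generation methods proposed in the dissertation. The names `ALL`, `covers`, `cover_partition` and `upper_bound`, and the three-row base table, are assumptions made for this example.

```python
from itertools import product

ALL = "*"  # wildcard: "any value" in this dimension (hypothetical notation)


def covers(cell, row):
    """A cell covers a base tuple if every non-wildcard value matches."""
    return all(c == ALL or c == r for c, r in zip(cell, row))


def cover_partition(base_rows):
    """Group every non-empty cube cell by the set of base tuples it covers.

    Exhaustive enumeration, exponential in the number of dimensions;
    suitable only for toy inputs.
    """
    n_dims = len(base_rows[0])
    # Candidate values per dimension: values seen in the data plus the wildcard.
    domains = [sorted({row[d] for row in base_rows}) + [ALL] for d in range(n_dims)]
    classes = {}
    for cell in product(*domains):
        cover = frozenset(i for i, row in enumerate(base_rows) if covers(cell, row))
        if cover:  # skip cells that cover no base tuple
            classes.setdefault(cover, []).append(cell)
    return classes


def upper_bound(cells):
    """Most specific cell of a class, i.e. the one with the fewest wildcards."""
    return min(cells, key=lambda c: c.count(ALL))


# Assumed toy base table with dimensions (store, product, season).
base = [
    ("S1", "P1", "s"),
    ("S1", "P2", "s"),
    ("S2", "P1", "f"),
]

classes = cover_partition(base)
bounds = {upper_bound(cells) for cells in classes.values()}
n_cells = sum(len(cells) for cells in classes.values())
# Here the 18 non-empty cells collapse into 6 classes, one upper bound each.
```

Storing one entry per class rather than one per cell is where the space saving comes from: every cell in a class covers the same base tuples, so it shares the same aggregate value as the class's upper bound.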
Keywords/Search Tags:Data cube, Cover quotient cube, Condensed cube, QC-table, QCbit index, Iceberg quotient cube