Research And Implementation Of Compression Technology In Column-Oriented Data Warehouse

Posted on:2014-01-31

Degree:Master

Type:Thesis

Country:China

Candidate:B T Long

Full Text:PDF

GTID:2248330395980924

Subject:Computer system architecture

Abstract/Summary:

As information has become a key factor in enterprise survival and development, extracting and analyzing information from huge amounts of data to support decision-making has become increasingly important. The data warehouse, as an important analysis tool for massive data, is attracting growing attention.

Traditional row-oriented database management systems can no longer handle analytic queries efficiently, so column-oriented storage architectures are receiving more attention. In analytical workloads such as data warehouse queries and business intelligence, a column-oriented storage architecture avoids reading irrelevant columns during query execution, which gives it an advantage over a row-oriented database.

Disk I/O is the main bottleneck of data warehouse queries and carries a high time cost, so reducing the amount of I/O can significantly improve query efficiency. Column-store technology stores values of the same data type together, which increases the similarity between adjacent values. A data warehouse using column storage can therefore compress data more effectively than one using traditional row storage, and data compression is one of the most important topics in column-oriented data warehouse management systems.

Based on the characteristics of the column-oriented data warehouse management system, this paper presents the design and implementation of a compression model, together with the design and implementation of decompression and of a scheme for execution on compressed data. It then proposes an improved version of a classic data compression algorithm: simple dictionary encoding based on a dynamic dictionary. The proposed method combines a column-level dictionary with sector-level dictionaries and counts the probability of occurrence of every data value in each sector, which supports building a streamlined, lightweight column-level dictionary. As a result, both the compression ratio and the query performance are improved. Finally, experimental results on the data warehouse benchmark data set SSB verify the effectiveness of the proposed method.
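
The abstract does not spell out how the two dictionary levels interact, so the following Python sketch is only one plausible reading, not the thesis's actual algorithm. It assumes that the most frequent values of a column go into a small column-level dictionary, that every other value goes into the dictionary of the sector where it occurs, and that an equality predicate can be evaluated on the codes without decompression. All function names, the `(level, index)` code layout, and the `column_dict_size` parameter are hypothetical.

```python
from collections import Counter


def build_dictionaries(sectors, column_dict_size):
    """Build one column-level dictionary and one dictionary per sector.

    Assumption: the column-level dictionary keeps only the most frequent
    values; all remaining values are stored in sector-level dictionaries.
    """
    totals = Counter()
    for sector in sectors:
        totals.update(sector)  # occurrence counts gathered sector by sector
    frequent = totals.most_common(column_dict_size)
    column_dict = {value: i for i, (value, _) in enumerate(frequent)}

    sector_dicts = []
    for sector in sectors:
        local = {}
        for value in sector:
            if value not in column_dict and value not in local:
                local[value] = len(local)
        sector_dicts.append(local)
    return column_dict, sector_dicts


def encode_sector(sector, column_dict, local_dict):
    """Encode values as (level, index): level 0 = column, 1 = sector."""
    return [
        (0, column_dict[v]) if v in column_dict else (1, local_dict[v])
        for v in sector
    ]


def decode_sector(codes, column_dict, local_dict):
    """Invert encode_sector by reversing both dictionaries."""
    column_rev = {i: v for v, i in column_dict.items()}
    local_rev = {i: v for v, i in local_dict.items()}
    return [column_rev[i] if level == 0 else local_rev[i] for level, i in codes]


def count_equal(codes, value, column_dict, local_dict):
    """Count rows equal to `value` by comparing codes, without decoding."""
    if value in column_dict:
        target = (0, column_dict[value])
    elif value in local_dict:
        target = (1, local_dict[value])
    else:
        return 0  # value does not occur in this sector
    return sum(1 for code in codes if code == target)


if __name__ == "__main__":
    # Illustrative data only; not taken from the SSB benchmark.
    sectors = [
        ["ASIA", "ASIA", "EUROPE", "ASIA"],
        ["ASIA", "AFRICA", "ASIA", "AFRICA"],
    ]
    column_dict, sector_dicts = build_dictionaries(sectors, column_dict_size=1)
    for sector, local in zip(sectors, sector_dicts):
        codes = encode_sector(sector, column_dict, local)
        print(codes, count_equal(codes, "ASIA", column_dict, local))
```

In this sketch the literal in the predicate is translated to a code once per sector, and the scan then compares small codes instead of the original values, which is the general idea behind executing queries on dictionary-compressed columns.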

Keywords/Search Tags:

data warehouse, column store, data compression

Related items

1	Research And Optimization Of Multidimensional Data Warehouse Model Based On Column Storage
2	Research And Implementation Of Data Reusing Strategy In Column-store Data Warehouse
3	Design And Implementation Of Data Dictionaries In Column Storage DWMS
4	Research On Non-Decompression Algebra Operation Algorithm On Compressed Data
5	Compression Algorithm Based On Support Columns Stored Data
6	Research And Implementation Of Data Compression Based On Column-Oriented Database System
7	Research And Implementation Of The Bitmap Index In Column-Oriented Data Warehouse
8	Research On Query Optimization In Column-Oriented Data Warehouse
9	Research And Implementation Of DWMS Compression Technology
10	Research And Implementation Of Query Execution In Column-Stored Data Warehouse Management System