
Research On Bitmap Index Technology And Application For Massive Data

Posted on: 2016-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wang
Full Text: PDF
GTID: 2348330476455739
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, large amounts of data of many different types have poured onto the network, and the era of big data has arrived. How to find the data that meets our needs quickly and precisely in a huge data center is a question of great significance and value. Among the many data indexing techniques, the bitmap index is widely used because it needs little storage space and supports fast search. To meet the storage and query demands of massive data, this thesis proposes the Sliced-Interval bitmap index. Based on the Sliced-Interval bitmap index, the thesis also proposes a method for optimizing membership queries. Finally, the thesis designs a bitmap appliance architecture for massive data. The work is described in detail as follows:

1) Proposed the Sliced-Interval bitmap index by combining the characteristics of massive data and bitmap indexes. To further reduce the index storage required for massive data, this thesis combines the Simple bitmap index, the Interval bitmap index and the Sliced bitmap index into a new bitmap index called the Sliced-Interval bitmap index. This structure greatly reduces storage space and is well suited to membership queries.

2) To improve search speed, this thesis analyzes the objects of users' queries and uses the Apriori algorithm to mine associations among the multiple values of a given column. By combining the Sliced-Interval bitmap index with the mined associations, membership query requests can be optimized and membership query efficiency improved.

3) Proposed the bitmap appliance architecture for massive data. This thesis proposes an index structure named “two-level bitmap index”. It consists of two index parts: one is called the global data Meta index, and the other is called the local data distributed index. First, to speed up access to the Meta index, a bitmap index is built on the Meta index. In addition, the thesis uses the PA algorithm to generate the Meta index, and optimizes and improves the Meta index so that data is distributed across the distributed storage nodes more reasonably. When the two-level bitmap index is used for data search, it avoids scanning all distributed storage nodes and thus improves search efficiency. Finally, to make the architecture more reliable, load balancing strategies are used on the global node.

4) In the last part of the thesis, a test scheme is designed. First, the search time of the different bitmap indexes is measured for equality queries, range queries and membership queries. Second, the efficiency of the Meta index under high concurrency is tested with the load balancing strategy applied. Third, read efficiency is tested as the distributed nodes are expanded. Finally, the test results are analyzed and conclusions are drawn.

The innovations of this thesis are the following two points:

1) Proposed the Sliced-Interval bitmap index structure and optimized membership queries on this structure.

2) Proposed the two-level bitmap index structure, which is suitable for distributed storage and query of massive data, and analyzed its advantages and disadvantages.
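
For readers unfamiliar with the three query types named in the test scheme, the following is a minimal sketch of a simple (equality-encoded) bitmap index. It is not the Sliced-Interval structure proposed in the thesis, whose details the abstract does not give. The class name `SimpleBitmapIndex`, its method names and the sample column are hypothetical, chosen only for illustration, and Python integers stand in for the bit vectors.

```python
class SimpleBitmapIndex:
    """Illustrative equality-encoded bitmap index (hypothetical names).

    One bitmap is kept per distinct column value; bit i of a bitmap is 1
    when row i holds that value. Python ints serve as bit vectors.
    """

    def __init__(self, column):
        self.n_rows = len(column)
        self.bitmaps = {}
        for row_id, value in enumerate(column):
            self.bitmaps[value] = self.bitmaps.get(value, 0) | (1 << row_id)

    def equality(self, value):
        # Equality query: read the single bitmap for this value.
        return self.bitmaps.get(value, 0)

    def membership(self, values):
        # Membership query (value IN set): OR the bitmaps of each member.
        result = 0
        for value in values:
            result |= self.bitmaps.get(value, 0)
        return result

    def range_query(self, low, high):
        # Range query (low <= value <= high): OR every bitmap in the range.
        result = 0
        for value, bitmap in self.bitmaps.items():
            if low <= value <= high:
                result |= bitmap
        return result

    def rows(self, bitmap):
        # Decode a result bitmap back into a sorted list of row ids.
        return [i for i in range(self.n_rows) if (bitmap >> i) & 1]


index = SimpleBitmapIndex([3, 1, 4, 1, 5, 9, 2, 6])
print(index.rows(index.equality(1)))         # [1, 3]
print(index.rows(index.membership({1, 4})))  # [1, 2, 3]
print(index.rows(index.range_query(2, 5)))   # [0, 2, 4, 6]
```

In this simple encoding a membership or range query costs one OR per matching value. Interval and sliced encodings are generally aimed at reducing how many bitmaps must be stored and combined, which is presumably the cost the Sliced-Interval index targets.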
Keywords/Search Tags: Massive Data, Bitmap Index, Apriori Algorithm, Distributed Storage, Query