HBase Non-primary Key Attribute Index Method And Implementation

Posted on:2017-04-28

Degree:Master

Type:Thesis

Country:China

Candidate:C Huang

GTID:2348330503495670

Subject:Management Science and Engineering

Abstract/Summary:

As the internet evolves toward the mobile internet, social networks and other new kinds of content keep emerging, and people can easily get the information they want. With this development, business takes increasingly diverse forms and data grows exponentially every day. Big data is of immeasurable value, and the relationships among data also play an important role in business operations and decision-making. It is therefore urgent to study efficient storage methods that provide real-time or near-real-time query capability for big data. Under big data volumes and highly concurrent requests, however, traditional relational databases hit a bottleneck and cannot meet demand. HBase, a typical NoSQL database, emerged to address this problem and provides an efficient technology and platform for storing and querying big data. Although HBase retrieves data efficiently by primary key, its support for retrieval by non-primary key attributes is unsatisfactory: such queries are inefficient and struggle to meet real-time or near-real-time requirements. Providing high-performance non-primary key attribute queries on HBase is therefore an important problem that urgently needs to be studied and solved.

Building on a further study of non-primary key attribute index methods for HBase, this thesis proposes a hierarchical index whose storage model is divided into two layers. The first-layer index is implemented with an HBase RegionObserver coprocessor and uses an improved d-left Counting Bloom Filter as its index structure; it avoids unnecessary data scanning and comparison and quickly locates the regions that store the email data being sought. The second-layer index is likewise implemented with an HBase RegionObserver coprocessor and uses an inverted index as its index structure; it searches the regions located by the first layer for the target email data. Finally, the thesis implements the hierarchical index, and the experimental results demonstrate that it effectively meets the non-primary key attribute query requirements of large-scale email analysis in practice.
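
The two-layer lookup described in the abstract can be illustrated with a small in-memory sketch. This is not the thesis's implementation: it does not use HBase or coprocessors, it uses a simplified d-left counting Bloom filter (fingerprints with counters, least-loaded placement with ties broken to the left) rather than the "improved" variant the thesis refers to, and all names (`DLeftCountingBloomFilter`, `RegionIndex`, `query`, the sender-address attribute) are hypothetical.

```python
import hashlib
from collections import defaultdict


def _digest(value: str, salt: str) -> int:
    """Deterministic 64-bit hash of value under a salt (illustrative choice)."""
    data = f"{salt}:{value}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class DLeftCountingBloomFilter:
    """Simplified d-left counting Bloom filter (assumed parameters, no HBase)."""

    def __init__(self, d: int = 4, buckets_per_table: int = 64):
        self.d = d
        self.buckets_per_table = buckets_per_table
        # tables[i][j] maps fingerprint -> counter for bucket j of subtable i
        self.tables = [[{} for _ in range(buckets_per_table)] for _ in range(d)]

    def _fingerprint(self, value: str) -> int:
        return _digest(value, "fp") & 0xFFFF

    def _candidates(self, value: str):
        # One candidate bucket per subtable, ordered left to right.
        return [(i, _digest(value, f"t{i}") % self.buckets_per_table)
                for i in range(self.d)]

    def add(self, value: str) -> None:
        fp = self._fingerprint(value)
        cands = self._candidates(value)
        for i, j in cands:
            if fp in self.tables[i][j]:
                self.tables[i][j][fp] += 1
                return
        # Least-loaded bucket; min() keeps the leftmost one on ties.
        i, j = min(cands, key=lambda c: len(self.tables[c[0]][c[1]]))
        self.tables[i][j][fp] = 1

    def remove(self, value: str) -> None:
        fp = self._fingerprint(value)
        for i, j in self._candidates(value):
            bucket = self.tables[i][j]
            if fp in bucket:
                bucket[fp] -= 1
                if bucket[fp] == 0:
                    del bucket[fp]
                return

    def might_contain(self, value: str) -> bool:
        fp = self._fingerprint(value)
        return any(fp in self.tables[i][j] for i, j in self._candidates(value))


class RegionIndex:
    """Stands in for one region: a layer-1 filter plus a layer-2 inverted index."""

    def __init__(self, name: str):
        self.name = name
        self.filter = DLeftCountingBloomFilter()
        self.inverted = defaultdict(set)  # attribute value -> row keys

    def put(self, row_key: str, value: str) -> None:
        rows = self.inverted[value]
        if row_key not in rows:
            rows.add(row_key)
            self.filter.add(value)


def query(regions, value: str):
    """Layer 1 skips regions that cannot hold value; layer 2 finds the rows."""
    hits = []
    for region in regions:
        if region.filter.might_contain(value):
            hits.extend(sorted(region.inverted.get(value, ())))
    return hits


r1 = RegionIndex("region-1")
r1.put("row1", "alice@example.com")
r1.put("row2", "bob@example.com")
r2 = RegionIndex("region-2")
r2.put("row3", "alice@example.com")
print(query([r1, r2], "alice@example.com"))
```

The filter can return false positives but never false negatives, so a region is skipped only when it certainly does not hold the value; a false positive costs one extra inverted-index lookup that returns nothing.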

Keywords/Search Tags:

Big Data, HBase, Non-primary Key Attribute, Hierarchical Index, d-left Counting Bloom Filter, Coprocessor

Related items

1	Design And Implementation Of HBase Hierarchical Auxiliary Index System
2	Research On Retrieval Speed Improvement Of HBase Based On Coprocessor Mechanism
3	The Design And Implementation Of Full Text Index For HBase Based On Lucene
4	OBF-Index:A Distributed Multi-Dimensional Index Based On Ordinal Bloom Filter
5	Multi-Bloom-Filter Query Algorithms And Their Applications
6	Research On HBase-based Mass Image Storage And Fast Retrieval Technology
7	Research And Application Of Bloom Filter In Duplicated Webpages Deletion
8	The Design And Implementation Of A Scalable Counting Bloom Filter
9	The Research And Implementation Of Indexing And Query Techniques Based On HBase And In-memory Database
10	Research On Cipher-Text Retrieval Technology With Support Of Result Sorting In Cloud Storage