HBase Non-Primary Key Attribute Index Method and Implementation

Posted on: 2017-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: C Huang
GTID: 2348330503495670
Subject: Management Science and Engineering
Abstract/Summary:
Today the internet is evolving toward the mobile internet, and social networks and other new kinds of content are constantly emerging, so people can easily get the information they want. As these new kinds of content develop, forms of business become more diverse and data grows exponentially every day. Big data is of immeasurable value, and the relationships among data also play an important role in business operations and decision-making. It is therefore urgent to study efficient storage methods for big data that provide real-time or near-real-time query capability. Under big data volumes and highly concurrent data requests, however, traditional relational databases run into bottlenecks and cannot meet demand. HBase, a typical NoSQL database, emerged to address this problem and provides an efficient technology and platform for storing and querying big data. Although HBase retrieves data efficiently by primary key, its support for retrieval by non-primary key attributes is unsatisfactory: such queries are inefficient and struggle to meet real-time or near-real-time requirements. Providing high-performance queries on non-primary key attributes in HBase is therefore an important problem in urgent need of study.
Building on further study of non-primary key attribute index methods for HBase, this paper proposes a hierarchical index whose index storage model is divided into two layers. The first layer is implemented on an HBase region observer coprocessor and uses an improved d-left Counting Bloom Filter as its index structure; it avoids unnecessary data scanning and comparison and quickly locates the regions that store the email data being sought. The second layer is also implemented on an HBase region observer coprocessor and uses an inverted index as its index structure; it searches the regions located by the first layer to find the target email data. Finally, this paper implements the hierarchical index, and the experimental results demonstrate that it can effectively meet the query requirements for non-primary key attributes in large-scale mail analysis.
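The two-layer lookup described in the abstract can be illustrated with a small in-memory sketch. It is not the thesis implementation: a plain counting Bloom filter stands in for the improved d-left Counting Bloom Filter, ordinary Python objects stand in for HBase regions and region observer coprocessors, and every name here (`CountingBloomFilter`, `Region`, `query`, the sender attribute) is hypothetical.

```python
import hashlib
from collections import defaultdict


class CountingBloomFilter:
    """Plain counting Bloom filter (an assumption; not the d-left variant)."""

    def __init__(self, num_counters=1024, num_hashes=3):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _slots(self, value):
        # Derive num_hashes counter positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.counters)

    def add(self, value):
        for slot in self._slots(value):
            self.counters[slot] += 1

    def remove(self, value):
        # Counters (rather than bits) are what make deletion possible.
        for slot in self._slots(value):
            if self.counters[slot] > 0:
                self.counters[slot] -= 1

    def might_contain(self, value):
        # False means definitely absent; True may be a false positive.
        return all(self.counters[slot] > 0 for slot in self._slots(value))


class Region:
    """Hypothetical stand-in for one HBase region plus its two index layers."""

    def __init__(self, name):
        self.name = name
        self.rows = {}                     # row key -> attribute value
        self.bloom = CountingBloomFilter()  # first layer: is the value here?
        self.inverted = defaultdict(set)    # second layer: value -> row keys

    def put(self, row_key, attribute_value):
        if row_key in self.rows:
            self.delete(row_key)  # keep both layers in sync on overwrite
        self.rows[row_key] = attribute_value
        self.bloom.add(attribute_value)
        self.inverted[attribute_value].add(row_key)

    def delete(self, row_key):
        value = self.rows.pop(row_key)
        self.bloom.remove(value)
        self.inverted[value].discard(row_key)
        if not self.inverted[value]:
            del self.inverted[value]


def query(regions, attribute_value):
    """Return (region name, row key) pairs whose attribute equals the value."""
    hits = []
    for region in regions:
        # Layer 1: skip regions that definitely do not hold the value.
        if not region.bloom.might_contain(attribute_value):
            continue
        # Layer 2: exact lookup inside the candidate region.
        for row_key in sorted(region.inverted.get(attribute_value, ())):
            hits.append((region.name, row_key))
    return hits


r1, r2 = Region("r1"), Region("r2")
r1.put("mail-001", "alice@example.com")
r1.put("mail-002", "bob@example.com")
r2.put("mail-101", "alice@example.com")
print(query([r1, r2], "alice@example.com"))
# [('r1', 'mail-001'), ('r2', 'mail-101')]
```

Because the second layer is exact, a false positive in the first layer only costs one extra dictionary lookup and never yields a wrong row; the filter's role in this sketch is purely to prune regions before they are searched.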
Keywords/Search Tags: Big Data, HBase, Non-primary Key Attribute, Hierarchical Index, d-left Counting Bloom Filter, Coprocessor