The Design And Implementation Of Full Text Index For HBase Based On Lucene

Posted on: 2014-02-22  Degree: Master  Type: Thesis
Country: China  Candidate: M H Zou  Full Text: PDF
GTID: 2248330395495253  Subject: Software engineering
Abstract/Summary:
With the rapid growth of data, data warehousing and data mining are receiving more attention. HBase, a popular NoSQL product, is now accepted by many users. However, searching for records in HBase can be frustrating in some use cases: in some situations, all of the data has to be scanned. An index, either a secondary index or a full-text index, is needed to make searching more efficient. We recently tried to add a full-text index feature to HBase. To implement it, the Lucene library was chosen to do the indexing and searching work. The index data is stored on HDFS. The full-text index code is built on the HBase coprocessor framework, which keeps the changes to HBase itself small. In our implementation, the index is built as data is inserted into HBase, and multi-keyword search is supported. Common word segmentation algorithms are also integrated. In our tests, the index was built as expected when data was inserted into HBase, and the response time of search queries was acceptable. However, the design does not seem suitable for high-concurrency situations.
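The core idea in the abstract, updating a text index on every insert and answering multi-keyword queries from that index instead of scanning all rows, can be illustrated with a toy model. The sketch below is not the thesis implementation: it uses neither Lucene, HDFS, nor the HBase coprocessor framework. The class `ToyIndexedTable`, its methods, the whitespace tokenizer, and the AND semantics for multiple keywords are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyIndexedTable:
    """In-memory stand-in for a table whose writes also update a text index.

    Hypothetical illustration only: a real system would delegate indexing
    to Lucene and hook into writes through an HBase coprocessor.
    """

    def __init__(self):
        self.rows = {}                    # row key -> stored text
        self.postings = defaultdict(set)  # term -> set of row keys

    @staticmethod
    def tokenize(text):
        # Naive whitespace tokenizer (assumption); real analyzers perform
        # proper word segmentation.
        tokens = (t.strip(".,;:!?").lower() for t in text.split())
        return [t for t in tokens if t]

    def put(self, row_key, text):
        # Drop stale postings when an existing row is overwritten.
        if row_key in self.rows:
            for term in self.tokenize(self.rows[row_key]):
                self.postings[term].discard(row_key)
        # Store the row and index it in the same call, mirroring the idea
        # of building the index along with the insert.
        self.rows[row_key] = text
        for term in self.tokenize(text):
            self.postings[term].add(row_key)

    def search(self, *keywords):
        # Multi-keyword search with AND semantics (assumption): return the
        # row keys whose text contains every keyword.
        if not keywords:
            return set()
        result = None
        for kw in keywords:
            keys = self.postings.get(kw.lower(), set())
            result = set(keys) if result is None else result & keys
        return result
```

For example, after `put("r1", "HBase stores data on HDFS")` and `put("r2", "HBase full-text index with Lucene")`, calling `search("hbase", "lucene")` consults only the posting sets for those two terms rather than every stored row.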
Keywords/Search Tags: HDFS, HBase, HBase Coprocessor, Lucene