Design and Implementation of a Complex-Condition Query System Based on HBase

Posted on: 2018-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: X W Wang
GTID: 2428330512966954
Subject: Communication and Information System
Abstract/Summary:
We live in the age of the Internet and information, surrounded every day by massive amounts of data. With the explosive growth of information, traditional relational databases can no longer keep up with the volume of data to be processed. This growth has had severe consequences for traditional business websites and for the original functions of existing information systems. The consequences stem not only from the growing scale of the data but also from changes in its structure: compared with the structured data held in relational databases, most of the data that now has to be handled is unstructured. Relational databases can neither process unstructured data nor handle massive data efficiently, so a new kind of database system is needed for large-scale, unstructured data. This need gave rise to many non-relational databases, and HBase is one of them.

HBase is an open-source, distributed, column-oriented database built on the Hadoop platform for large-scale data. It has unique advantages in processing large volumes of unstructured data, but it also has inherent shortcomings. First, HBase only supports retrieving data by RowKey. The content of the stored data is hidden from the user, who can look up a value by its key but cannot filter data by value, which is inconvenient in practice. Second, HBase gives up features of relational databases such as transactions, secondary (two-level) indexes, and data retrieval with Structured Query Language (SQL) statements. Many applications, however, need to retrieve data according to specific content. At the same time, because of the complexity of the data, the uncertainty of its structure, and the growth of data in current systems, users' demands on the speed and accuracy of the system keep rising. Developers therefore generally want HBase to offer some support for complex queries while keeping its high-performance advantage. This thesis introduces a non-invasive mechanism into the current HBase to implement a high-performance complex-condition query system based on a secondary index.

On the basis of HBase, this thesis designs a high-performance system that supports queries with complex conditions. The system keeps the original characteristics of HBase while improving usability and real-time behavior, and it adds support for SQL to make the system easier to use. To support SQL queries while ensuring query efficiency, a secondary index is built to serve real-time data queries. The SQL statements that users enter pass through an SQL conversion engine, which parses them and extracts the fields and keywords. These are then passed to a query planner, which plans the processing flow for each type of SQL statement and translates it into HBase API calls. ANTLR is used as the SQL statement parser. To improve query efficiency, a secondary index is built on the data. The Coprocessor framework is used to implement delete-by-attribute-condition and update-by-attribute-condition functions, and Coprocessors also intercept put, delete, and other operations on a Region to generate the index in real time. To guarantee the consistency of the index data, the thesis also provides index generation based on the MapReduce framework. Finally, the system is tested to verify its functionality and performance, and results under different experimental conditions are compared. The experimental results show that the system supports SQL queries and performs well, and that complex-condition queries do not sacrifice the original performance of HBase.
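The secondary-index idea described in the abstract can be illustrated with a minimal in-memory sketch. It is not the thesis implementation and does not use the HBase API: plain Python dictionaries stand in for the data table and the index table, and the hooks inside `put` and `delete` stand in for the Coprocessor that intercepts those operations. The names `IndexedTable`, `query`, and `delete_where` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class IndexedTable:
    """In-memory stand-in for a data table plus a value-to-row-key index.

    Assumption: dictionaries replace HBase tables, and the index updates
    inside put/delete mimic what a Coprocessor would do on interception.
    """

    def __init__(self, indexed_columns):
        self.rows = {}  # row key -> {column: value}
        # column -> value -> set of row keys (plays the role of an index table)
        self.index = {col: defaultdict(set) for col in indexed_columns}

    def _unindex(self, row_key):
        """Remove the index entries that point at the current version of a row."""
        old = self.rows.get(row_key, {})
        for col, by_value in self.index.items():
            if col in old:
                by_value[old[col]].discard(row_key)
                if not by_value[old[col]]:
                    del by_value[old[col]]

    def put(self, row_key, columns):
        """Write columns and keep the index in step with the data."""
        self._unindex(row_key)
        merged = {**self.rows.get(row_key, {}), **columns}
        self.rows[row_key] = merged
        for col, by_value in self.index.items():
            if col in merged:
                by_value[merged[col]].add(row_key)

    def delete(self, row_key):
        """Delete a row together with its index entries."""
        self._unindex(row_key)
        self.rows.pop(row_key, None)

    def query(self, conditions):
        """Return sorted row keys matching an AND of equality conditions."""
        candidates = None
        residual = {}
        for col, value in conditions.items():
            if col in self.index:
                keys = self.index[col].get(value, set())
                candidates = set(keys) if candidates is None else candidates & keys
            else:
                residual[col] = value  # no index: check against row contents
        if candidates is None:
            candidates = set(self.rows)  # no indexed condition: full scan
        return sorted(
            key for key in candidates
            if all(self.rows[key].get(c) == v for c, v in residual.items())
        )

    def delete_where(self, conditions):
        """Delete every row matching the conditions; return how many were removed."""
        keys = self.query(conditions)
        for key in keys:
            self.delete(key)
        return len(keys)


table = IndexedTable(indexed_columns=["city"])
table.put("u1", {"city": "Beijing", "age": 30})
table.put("u2", {"city": "Shanghai", "age": 25})
table.put("u3", {"city": "Beijing", "age": 25})
print(table.query({"city": "Beijing"}))             # ['u1', 'u3']
print(table.query({"city": "Beijing", "age": 25}))  # ['u3']
table.put("u1", {"city": "Shanghai"})               # index follows the update
print(table.query({"city": "Beijing"}))             # ['u3']
```

In this sketch a condition on an indexed column is answered from the index without touching other rows, while a condition on an unindexed column falls back to checking row contents, which is the cost the secondary index is meant to avoid. A condition such as `WHERE city = 'Beijing' AND age = 25` would correspond to the dictionary passed to `query`; the SQL parsing step itself is outside the scope of this example.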
Keywords/Search Tags: Big data, HBase, SQL parser, Secondary (two-level) index