
The Big Data Non-invasive Secondary Index Research Of Environment Air Quality Monitoring

Posted on: 2018-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhang
GTID: 2348330542970292
Subject: Computer application technology
Abstract/Summary:
Traditional relational databases have many drawbacks when handling explosively growing volumes of data, such as poor scalability, poor fault tolerance, and unreliability. Big data techniques, with their high fault tolerance, scalability, performance, and availability, have instead become the standard approach to managing huge amounts of data. However, current big data frameworks also have shortcomings. Because they lack the view and index techniques of relational databases, they can only answer queries based on the primary key. They therefore cannot efficiently support joins, multi-dimensional conditional queries, and other complex operations, which limits the application of big data management systems in real production environments.

To address these problems, this thesis designs and implements a non-invasive secondary index scheme. The basic principle is to build index entries in which the value of an indexed column serves as the key and the RowKey of the target record serves as the value, so that the big data platform HBase can retrieve the corresponding "key-value" data and quickly locate the target record. This supports complex queries based on values rather than RowKeys, such as efficient retrieval over a value interval in HBase. The scheme adopts a C/S architecture. In the server cluster, the Observer coprocessor is responsible for building the index data in parallel, and the EndPoint coprocessor is responsible for executing the query logic concurrently. The client only needs to pass the complex query condition to the server through the Protobuf protocol, and all target records are returned to the client once the EndPoint coprocessor call completes.

The secondary index scheme is evaluated on big data from environmental air quality monitoring. First, an HBase storage model is designed to store the massive volume of air quality monitoring data, and then the non-invasive secondary index is introduced to support complex conditional queries. The experimental results show that the proposed secondary index scheme supports efficient multi-dimensional queries and guarantees a higher system throughput, achieving the expected goals.

In summary, the design makes up for deficiencies in current big data management systems, extends their query functionality through the secondary index, and enhances system availability. The design scheme of the non-invasive secondary index is reasonable and feasible, and has theoretical research significance and strong practical value.
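
The secondary index principle described in the abstract can be illustrated with a small in-memory sketch. This is not the thesis's implementation and does not use the HBase API: the class `ToyIndexedTable`, the `pm25` column, and the `station#date` RowKey layout are all hypothetical, chosen only to show how index entries keyed by column value and pointing back to a RowKey make a value-interval query possible without scanning every record. The index update inside `put` stands in for the work an Observer-style hook would do on each write, and `query_range` stands in for the server-side query logic.

```python
import bisect


class ToyIndexedTable:
    """In-memory stand-in for a key-value table plus one secondary index.

    Hypothetical illustration only; it does not call any HBase API.
    """

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}    # RowKey -> record (dict of column -> value)
        self.index = []   # sorted list of (column value, RowKey) entries

    def put(self, row_key, record):
        # Drop the stale index entry if this RowKey is being overwritten.
        old = self.rows.get(row_key)
        if old is not None and self.indexed_column in old:
            stale = (old[self.indexed_column], row_key)
            pos = bisect.bisect_left(self.index, stale)
            if pos < len(self.index) and self.index[pos] == stale:
                del self.index[pos]
        # Write the data, then the index entry (value -> RowKey).
        self.rows[row_key] = dict(record)
        if self.indexed_column in record:
            entry = (record[self.indexed_column], row_key)
            bisect.insort(self.index, entry)

    def query_range(self, low, high):
        # Seek to the first index entry with value >= low, then read
        # forward until the value exceeds high.
        start = bisect.bisect_left(self.index, (low,))
        results = []
        for value, row_key in self.index[start:]:
            if value > high:
                break
            results.append((row_key, self.rows[row_key]))
        return results


table = ToyIndexedTable("pm25")
table.put("station01#20180101", {"pm25": 35, "so2": 8})
table.put("station02#20180101", {"pm25": 120, "so2": 15})
table.put("station03#20180101", {"pm25": 78, "so2": 11})
table.put("station01#20180101", {"pm25": 90, "so2": 9})  # overwrite

# RowKeys of records whose pm25 value lies in [75, 100], in index order.
matches = [row_key for row_key, _ in table.query_range(75, 100)]
```

Because the index entries are kept sorted by value, the interval query only touches the entries inside the requested range. In a real deployment the index would live in its own table and be maintained on the server side, which this sketch does not attempt to model.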
Keywords/Search Tags:big data, secondary index, coprocessor, multi-dimensional queries