
Design And Performance Analysis For HBase-Based Query System On Satellite Space Data

Posted on: 2016-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ma
Full Text: PDF
GTID: 2348330491960885
Subject: Computer technology
Abstract/Summary:
As modern aviation and information technology have converged, both data sensing capability and data collection range have improved greatly. Satellite spatial data exhibit the 4V features (Volume, Variety, Value and Velocity), and traditional SQL databases can no longer meet the demand for fast query and analysis of such data, because they lack scalability and are limited in parallel data processing. In recent years, data storage techniques based on Hadoop and HBase have matured, offering strong multi-tasking capability and data scalability. They are a promising way to address these limitations in massive data storage and data query.
Without proper design of the overall data storage system, however, real applications run into several problems. HBase sorts rows lexicographically by Rowkey, and because satellite spatial data are time-ordered, writes frequently create hot spots, which drastically degrades the system's load balancing and query speed. Meanwhile, as data are continuously collected and saved, some regions of the data table are constantly being split and merged, which hurts the system's write performance. In addition, multi-dimensional spatial data queries in HBase often entail a full table scan on the column keywords, so query efficiency can be too low to meet the demands of real applications.
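The hot-spot effect can be illustrated without a running cluster. The following is a minimal sketch in plain Python, not the thesis's implementation: the four-region layout, the split points, the `SAT01` identifier, and the helper names `time_key`, `salted_key`, `region_for`, and `region_histogram` are all assumptions made for illustration. It mimics how a lexicographically sorted table assigns row keys to regions, and contrasts time-ordered keys with keys that carry a hash-derived prefix.

```python
import bisect
import hashlib

# Hypothetical table layout: 4 regions, split on the first byte of the row key.
# Region 0 holds keys < "1", region 1 holds ["1", "2"), and so on.
NUM_REGIONS = 4
SPLIT_POINTS = [b"1", b"2", b"3"]


def region_for(row_key: bytes) -> int:
    """Return the index of the region whose key range contains row_key."""
    return bisect.bisect_right(SPLIT_POINTS, row_key)


def time_key(ts_ms: int, sat_id: str) -> bytes:
    """Time-ordered row key: zero-padded millisecond timestamp, then an id."""
    return f"{ts_ms:013d}-{sat_id}".encode()


def salted_key(ts_ms: int, sat_id: str) -> bytes:
    """Same key with a one-digit bucket prefix derived from a hash of the key."""
    plain = time_key(ts_ms, sat_id)
    bucket = hashlib.md5(plain).digest()[0] % NUM_REGIONS
    return str(bucket).encode() + b"-" + plain


def region_histogram(keys) -> list:
    """Count how many keys fall into each region."""
    counts = [0] * NUM_REGIONS
    for key in keys:
        counts[region_for(key)] += 1
    return counts


start = 1467590400000  # an arbitrary millisecond timestamp (assumption)
plain_keys = [time_key(start + i, "SAT01") for i in range(1000)]
salted_keys = [salted_key(start + i, "SAT01") for i in range(1000)]

print("time-ordered keys per region:", region_histogram(plain_keys))
print("salted keys per region:      ", region_histogram(salted_keys))
```

With these assumed split points, every time-ordered key begins with the same digit, so all 1000 writes land in a single region, while the prefixed keys are spread across all four. The trade-off is that a time-range read over prefixed keys has to query each bucket separately.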
In this paper, system designs are proposed to solve the above difficulties in two respects. For storage, discrete data columns and a pre-partitioning hash design are devised to avoid the hot spot problem, thereby balancing load and improving the system's write performance. For data query, a new GKD-HBase indexing model is established that combines two indexing methods, the Grid tree method and the KD tree method, as the first-level and second-level index respectively. Multi-dimensional data are reduced to one-dimensional data using Hilbert space-filling curves so that they are better prepared for subsequent processing, which further improves the system's query efficiency.
Finally, both redesigns are experimentally tested and analyzed. The results confirm that Rowkey hashing with discrete data columns and the pre-partitioning hash design do mitigate the hot spot problem and achieve load balancing when processing large amounts of satellite spatial data. Write performance is best when the size of each region is set to around 7 GB. In large-data environments, the experiments also verify the advantage of the GKD-HBase double-index technique over the traditional Grid index in processing massive amounts of multi-dimensional spatial data. This provides strong support for real-world HBase applications involving the storage and query of such satellite spatial data.
Data mining and analysis of satellite spatial data can reveal much information about potential air or sea targets. For instance, identifying and tracking such a target involves storing large amounts of real-time satellite spatial data while querying and processing those data quickly at the same time, which is essential to complete the task. The architectural design of the data storage and query schema proposed in this paper will be of great practical value in such applications.
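The dimension reduction mentioned in the abstract can be sketched with the textbook two-dimensional Hilbert curve mapping. This is an illustration under stated assumptions rather than the GKD-HBase construction itself: the abstract does not specify the grid resolution or how coordinates are quantized, so `grid_cell`, its longitude/latitude ranges, and the grid size `n` are hypothetical choices.

```python
def hilbert_index(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n-by-n grid to its position along the Hilbert curve.

    n must be a power of two; x and y must lie in the range 0..n-1.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or reflect the quadrant so the next level of the curve
        # is traversed in the canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def grid_cell(lon: float, lat: float, n: int) -> tuple:
    """Quantize a longitude/latitude pair to a cell of an n-by-n grid.

    Hypothetical helper: assumes lon in [-180, 180] and lat in [-90, 90].
    """
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return x, y


# Example: turn a position into a one-dimensional key component.
x, y = grid_cell(116.4, 39.9, 1024)
print("cell:", (x, y), "hilbert index:", hilbert_index(1024, x, y))
```

Cells that are consecutive along the curve are always neighbours in the grid, so a one-dimensional range of index values corresponds to a compact spatial area. That property is what makes the resulting value usable as part of a sorted key.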
Keywords/Search Tags:Satellite Spatial Data, GKD-HBase, Grid, KD Tree, Query System