
Research On Index And Query Technology Of Spatio-temporal Data Based On Hadoop

Posted on: 2019-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: L Z He
GTID: 2428330572450212
Subject: Computer system architecture
Abstract/Summary:
With the development of Temporal Geographic Information Systems and the popularity of intelligent mobile devices, massive amounts of valuable spatio-temporal data are produced every day. How to store, query, and analyze large volumes of complex spatio-temporal data efficiently has become a hot issue for major research institutions and IT companies. Traditional databases, with their limited storage capacity and structured storage model, are poorly suited to storing and managing spatio-temporal data that is both structurally complex and large in scale. Hadoop, a distributed platform, is naturally suited to storing and processing unstructured big data. This thesis therefore proposes, designs, and implements a spatio-temporal data indexing and querying technology based on Hadoop. The system can store, index, and query large amounts of spatio-temporal data effectively, and offers high concurrency, easy expansion, and real-time queries.

The overall technical architecture is divided logically into three layers, from bottom to top: the storage system, the index system, and the query engine. The storage system implements the import and storage of spatio-temporal data. The index system supports the construction, storage, loading, and retrieval of spatio-temporal indexes. The query engine implements the query and analysis of spatio-temporal data and provides interfaces for external applications.

The main work of this thesis is as follows:
(1) Based on the structural characteristics of spatio-temporal data and the storage characteristics of HBase, a storage model for spatio-temporal data in HBase is designed and implemented. The influence of rowkey design on query performance is also analyzed.
(2) A parsing interface for spatio-temporal data files is abstracted, and a flexible data import module is implemented that can be easily extended to support different spatio-temporal file formats.
(3) A two-layer distributed spatio-temporal index, named GR-Tree, is constructed based on the Quad-Tree and the 3DR-Tree. Serialization and deserialization of the GR-Tree are studied and implemented, so that the index supports disk persistence and dynamic loading of subtrees. In addition, the distributed storage of the GR-Tree is optimized to improve query efficiency.
(4) Based on HBase coprocessors, three spatio-temporal query algorithms are implemented: the time-window query, the time-interval k-nearest-neighbor query, and the topology query for a specific moving object. These algorithms are also optimized to improve query performance.

Performance tests of the above designs were carried out in a laboratory environment. The results show that the Hadoop-based spatio-temporal data indexing and querying technology can be applied effectively to an actual Hadoop cluster. The time-window query outperforms other similar programs, and regular spatio-temporal queries meet the requirements for real-time response and high concurrency.
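The abstract notes that rowkey design affects query performance but does not give the layout the thesis uses. The following is a minimal sketch of one common approach, not the thesis's actual design. It assumes a hypothetical rowkey of the form `cell|timestamp|object_id`, where `cell` is a quadtree cell code. An in-memory sorted list stands in for HBase's lexicographically ordered rowkeys, and the function names (`quad_cell`, `make_rowkey`, `time_window_scan`) are illustrative only.

```python
from bisect import bisect_left, bisect_right


def quad_cell(lon, lat, depth=8):
    """Return a quadtree cell code (digits 0-3, one per level) for a point."""
    min_lon, max_lon, min_lat, max_lat = -180.0, 180.0, -90.0, 90.0
    digits = []
    for _ in range(depth):
        mid_lon = (min_lon + max_lon) / 2
        mid_lat = (min_lat + max_lat) / 2
        quadrant = 0
        if lon >= mid_lon:      # east half sets bit 0
            quadrant |= 1
            min_lon = mid_lon
        else:
            max_lon = mid_lon
        if lat >= mid_lat:      # north half sets bit 1
            quadrant |= 2
            min_lat = mid_lat
        else:
            max_lat = mid_lat
        digits.append(str(quadrant))
    return "".join(digits)


def make_rowkey(lon, lat, ts, object_id, depth=8):
    """Hypothetical layout: cell code | zero-padded epoch seconds | object id.

    Zero padding keeps string order equal to numeric time order.
    """
    return f"{quad_cell(lon, lat, depth)}|{ts:010d}|{object_id}"


def time_window_scan(sorted_keys, cell, t_start, t_end):
    """Return keys in one cell with t_start <= ts <= t_end (inclusive).

    Mimics a start/stop-row range scan over sorted keys.
    Assumes object ids contain only ASCII characters.
    """
    start = f"{cell}|{t_start:010d}|"
    stop = f"{cell}|{t_end:010d}|\xff"
    i = bisect_left(sorted_keys, start)
    j = bisect_right(sorted_keys, stop)
    return sorted_keys[i:j]


if __name__ == "__main__":
    keys = sorted([
        make_rowkey(116.40, 39.90, 100, "taxi-1"),
        make_rowkey(116.40, 39.90, 150, "taxi-2"),
        make_rowkey(116.40, 39.90, 300, "taxi-3"),
        make_rowkey(-73.90, 40.70, 150, "taxi-4"),
    ])
    cell = quad_cell(116.40, 39.90)
    for key in time_window_scan(keys, cell, 100, 200):
        print(key)
```

Putting the cell code first keeps records from the same region adjacent, so a time-window query over one cell becomes a single contiguous range scan. The trade-off is that a query spanning many cells needs one scan per cell, which is the kind of cost a separate spatio-temporal index is meant to reduce.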
Keywords/Search Tags:Spatio-temporal data, Spatio-temporal index, Spatio-temporal query, Hadoop, Distributed platform