Font Size: a A A

Distributed Index For Spatio-Temporal Data Stream

Posted on:2020-11-22Degree:MasterType:Thesis
Country:ChinaCandidate:T ZhangFull Text:PDF
GTID:2428330599976462Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
With the rapid development of Internet of Things technology and location-based services(LBS),location-aware services(such as traffic monitoring and friend search)are playing an increasingly important role in everyday life.Spatio-temporal data arrives continuously with high arrival rate and massive data size,and has multiple attributes,which we call spatio-temporal data streams.LBS applications require new query processing techniques to process spatial and temporal domains in real time.Indexing is the infrastructure that speeds up the query process.For spatial indexing,many scholars have proposed a variety of batch construction methods,especially R-tree,but these methods are still insufficient in the face of large data flow scenarios.They usually only consider static data or apply batch updates to existing spatial indexes.The existing batch construction methods can not complete the real-time processing of spatiotemporal flow data in the case of massive spatiotemporal flow data.This paper proposes a distributed memory index for spatio-temporal data streams.The index uses a time window mechanism to divide the spatio-temporal data stream into consecutive time window fragments,and uses an R-tree bulk loading algorithm to construct an inner R-tree for each time window.The root node information of R-tree and the time information of the time window are combined into a <key,value> tuple for constructing an outer B+ tree.Through the two-layer distributed index structure of the outer B+ tree and the inner R tree,the efficient storage of spatiotemporal data is realized.Our main contributions of this paper are as follows:4)A batch loading algorithm called DSort Load is proposed.It combines the bulk load scheme with the STR and constructs the index by first constructing the R tree skeleton and then assigning the keys to the tree later.5)A bulk loading algorithm called SSort Load is proposed.The SSort Load algorithm accelerates the R-tree construction by sorting the sampled time window data,discarding unnecessary time-consuming sort operations.SSort Load is much faster than DSort Load,with low latency and high throughput.It can provide query services quickly.6)A two-layer distributed memory indexing system for spatio-temporal data streams is proposed.Through this index structure,realizing the distributed and efficient storage of spatio-temporal data streams,providing indexing service with high concurrency and low latency.
Keywords/Search Tags:R-tree, B+ tree, memory index, data stream, distributed
PDF Full Text Request
Related items