
Research On Analytics Of Distributed Big Temporal Data

Posted on: 2020-11-10
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhang
GTID: 2428330620459983
Subject: Computer Science and Technology
Abstract/Summary:
Temporal data is ubiquitous, and massive amounts of it are generated today. Managing big temporal data is important yet challenging, and a distributed system is a natural choice for handling it. However, existing distributed solutions either lack native support for temporal queries or are disk-based and suffer from I/O bottlenecks, so they cannot meet the requirements of high efficiency and scalability. This thesis proposes an in-memory two-level index solution in Spark for processing big temporal data. With its global-local index structure, the solution uses the global index to filter candidate partitions effectively and the local index to speed up queries within each partition, which remarkably improves the performance of temporal operations such as time travel, temporal aggregation, and temporal join. Furthermore, we design a partition method for temporal data that optimizes partition filtering with the global index. Comprehensive experiments show that the proposed solution supports low-latency, high-throughput analysis of big temporal data.
Keywords/Search Tags:Big temporal data, Temporal queries, Distributed in-memory analytics, Two-level index, Partition method, Spark framework
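
The global-local index idea described in the abstract can be illustrated with a small single-process sketch. This is not the thesis's implementation and does not use Spark: the names `Partition`, `TwoLevelIndex`, `candidate_partitions`, and `time_travel` are hypothetical, records are assumed to be `(start, end, payload)` tuples valid over the half-open interval `[start, end)`, partitioning is assumed to be a simple split by sorted start time, and the local index is assumed to be a sorted list of start times searched with binary search.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed record layout: (start, end, payload), valid over [start, end).
Record = Tuple[int, int, str]


@dataclass
class Partition:
    """One partition with a local index (sorted start times). Hypothetical."""
    records: List[Record]
    starts: List[int] = field(init=False)

    def __post_init__(self) -> None:
        # Local index: keep records sorted by start so bisect can be used.
        self.records = sorted(self.records)
        self.starts = [r[0] for r in self.records]

    @property
    def bounds(self) -> Tuple[int, int]:
        # Time range covered by this partition; stored in the global index.
        return (self.starts[0], max(r[1] for r in self.records))

    def time_travel(self, t: int) -> List[Record]:
        # Binary search skips records that start after t,
        # then a scan keeps those still valid at t.
        hi = bisect_right(self.starts, t)
        return [r for r in self.records[:hi] if r[1] > t]


class TwoLevelIndex:
    """Global index over partition bounds plus a local index per partition."""

    def __init__(self, records: List[Record], partition_size: int) -> None:
        ordered = sorted(records)
        # Assumed partitioning: fixed-size chunks in start-time order.
        self.partitions = [
            Partition(ordered[i:i + partition_size])
            for i in range(0, len(ordered), partition_size)
        ]
        self.global_index = [p.bounds for p in self.partitions]

    def candidate_partitions(self, t: int) -> List[int]:
        # Global filtering: only partitions whose range contains t survive.
        return [i for i, (lo, hi) in enumerate(self.global_index)
                if lo <= t < hi]

    def time_travel(self, t: int) -> List[Record]:
        # Local lookup runs only inside the surviving partitions.
        result: List[Record] = []
        for i in self.candidate_partitions(t):
            result.extend(self.partitions[i].time_travel(t))
        return result
```

In this sketch a time-travel query at timestamp `t` first discards every partition whose time range excludes `t`, and only then touches record data. A distributed version would presumably hold the partition bounds on the driver and the local indexes in executor memory, but that mapping is an assumption rather than something the abstract specifies.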