Font Size: a A A

Distributed Parallel Spatio-temporal Index Techniques

Posted on:2015-04-24Degree:DoctorType:Dissertation
Country:ChinaCandidate:Z F ZhengFull Text:PDF
GTID:1220330431970432Subject:Geographic Information System
Abstract/Summary:PDF Full Text Request
The real world is an eternal four-dimensional space which changes and generates many kinds of massive spatio-temporal data all the time. The spatio-temporal data can help us understand the history, grasp the current, predict the future so that it improves our capabilities including the insight, perception, and forecasting of the four-dimensional time-space. How to effectively store and manage these massive spatio-temporal datasets, is still one of the key technical issues in the field of new generation geosciences information system which manages massive spatio-temporal data based on distributed collaboration and high-performance computing. The new generation multi-dimensional (mainly four-dimensional) spatial information system is currently only at the beginning of the research stage, and many technical bottlenecks still exist, for instance, big spatio-temporal data management, scheduling and fast retrieval, multi-level caching technology. And the efficient multi-dimensional spatio-temporal index is the base of the solution to these problems. With the development of computer hardware, multi-core computer has become conventional computing device. How to construct an effective architecture for distributed spatio-temporal index and reduce the time and space costs of multi-dimensional parallel spatio-temporal index method due to frequent update in distributed environment, has become one of the key scientific and technological problems.Currently, most of the spatio-temporal studies mainly focus on the local spatio-temporal index but rarely on the studies of distributed spatio-temporal index and parallel spatio-temporal index. Additionally, these two aspects are often studied separately. There has no direct integrated research on the distributed parallel spatio-temporal index. To reduce the time and space costs of the parallel spatio-temporal index, the existing methods always concentrate on the concurrency control algorithms themselves, but lack the concentration on the spatio-temporal index structure itself. The most used tree-like spatio-temporal indexes are hierarchy structure so that it restricts the implement of the parallel algorithms. The lack of researches on the integration of distributed spatio-temporal index and parallel spatio-temporal index and the parallel spatio-temporal index parallelization have become a serious problem which blocked to resolve many problems in the field of spatio-temporal database, for exmaple, distributed parallel cache, pre-scheduling and scheduling, and spatio-temporal analysis. Therefore, it is important that we should begin to study the spatio-temporal methods which are parallel structure and the integration of distributed spatio-temporal index and parallel spatio-temporal index immediately.In response to these requirements, a multi-level adaptive architecture of distributed parallel spatio-temporal index (DPSI),which is suitable for the distributed parallel computing environment and an interval-relationship-operator-based parallel spatio-temporal index, which is suitable for being the local index of DPSI, were proposed on the base of National High Technology Research and Development Program project (2012AA121401,2008AA121602) and the National Natural Science Fund Project (41101368) research results. It broke the multi-dimensional index tree hierarchy restrictions on parallel algorithms, refined parallel granularity and reduced concurrency control. The master-salve distributed parallel spatio-temporal index (MSDPSI) and peer-to-peer distributed parallel spatio-temporal index (PPDPSI) were designed and implemented to enhance the multi-dimensional parallel spatio-temporal index performance in the distributed parallel computing environment. The main work of this paper is as follows:(1) Summarized the related work of distributed parallel spatio-temporal index. At first the research objective and scientific significance were proposed, and then stated the related distributed parallel spatio-temporal indexing technology contexts:common spatio-temporal index, distributed spatio-temporal index and parallel spatio-temporal index, analyzed the state-of-art and the existing problems and presented the main research contents, methods and technology roadmap. In addition, this paper discussed the view of time and space, the space and time expression methods, and spatio-temporal objects’traits which are related to spatio-temporal index.(2) Proposed a theoretical architecture and key algorithms for the distributed parallel space-time index. The spatio-temporal data partitioning method and formal description of DPSI have been studied. The global framework has two patterns:master-slave pattern and peer-to-peer pattern. The local index uses the interval-relationship-operator-based parallel spatio-temporal index (IPSI). According to this architecture, the MSDPSI and the PPDPSI methods have been presented. At first, the DPSI algorithm principle was proposed and then implemented the MSDPSI and PPDPSI and tested their performance. The test results show that both of them have good query performance, however the PPDPSI has better self-governing and extendibility but lower update performance. The MSDPSI has more efficient update performance but it also has the master server performance bottleneck limit.(3) Proposed the interval-relationship-operator-based parallel spatio-temporal index (IPSI) method. We have studied the expression methods of the spatio-temporal data and the interval data and proposed the formula of the transformation between the spatio-temporal data and the interval data. These formula was used to transform the spatio-temporal query into some interval relationship operators. It is the theoretical base of the IPSI. The IPSI’s algorithm principle and data structure were proposed in this paper. Then the update and insert algorithms were designed and implemented based on the principle and data structure. The experimental results show that the IPSI has excellent query and update performance on multi-core computer.(4) Developed a distributed parallel spatio-temporal data engine (DPSDE) and a spatio-temporal database management system prototype based on the DPSI. These two have been used in several cities for spatio-temporal data management. It proved the DPSI has effectiveness and practicality.Distributed parallel spatio-temporal index has two main breakthrough directions. One is the use of advanced concurrency control technology to achieve distributed parallel features, and the other is to try to make the index itself has distributed parallel structure. Our research work is focused on the second aspect. Among the above research work, there are two main innovative results:(1) Proposed an adaptive multi-level architecture and algorithms for distributed parallel spatio-temporal index. To solve the massive spatio-temporal management problems in different network environment and parallel computing environment, a multi-level adaptive distributed parallel spatio-temporal index architecture, called DP SI, has been designed. The computing resources are classified into multi-levels, such as global network, network node, CPU and cores in this architecture. The DPSI, which was constructed on the base of interval-relationship-operator, can adjust a network node loading dynamically according to the node’s computing capability, develop the parallel computing capability of a single node. It helps improve the performance of the distributed parallel spatio-temporal index. Additionally, MSDPSI and PPDPSI distributed parallel spatio-temporal indexing methods have been proposed. Most of the existing distributed spatio-temporal index methods focus on the distributed features, but not the parallel computation of the network nodes. Hence two spatio-temporal indexing methods based on DPSI, named MSDPSI and PPDPSI, have been proposed. These two used master-salve mode and peer-to-peer mode for different network structure. The local index is IPSI method. These two methods that both have the abilities of dynamic node management and load balance, enhanced the index autonomy and extensibility, improved the integrated performance of the DPSI.(2) Proposed an interval-relationship-operator-based parallel spatio-temporal index. To break through the parallel computing limitation to the multi-dimensional tree-like spatio-temporal index, an interval-relationship-operator-based parallel spatio-temporal index, called IPSI, has been proposed in this paper. A transformation method between spatio-temporal query and interval relationship operators which can be computed in parallel was proposed to support unified query interface. IPSI has a lot of virtual binary trees which only contain leaf nodes. It can reduce the tree’s IO numbers and enhance the virtual binary tree performance. In addition, the fact that the start value of an interval is always less than the end value, decreases the computation of the index dramatically. Therefore, the IPSI index solved the high spatio-temporal data coupling problem and parallel limitation problems, developed the multi-cores parallel computing advantages, and enhanced the indexing performance.The study results in this paper provide a viable, universal, efficient distributed parallel spatio-temporal indexing solutions for three-dimensional, four-dimensional or higher dimensional fast retrieval of spatio-temporal data. The following research topics will focus on the cost model and some security problems of DPSI.
Keywords/Search Tags:Spatio-temporal Query, Spatio-temporal Index, Distributed Computing, ParallelComputing, Interval Relationship Operator
PDF Full Text Request
Related items