
Research On Key Problems For Parallel Processing Of Remotely Sensed Imagery Using MapReduce Model

Posted on: 2015-01-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Y Xia
Full Text: PDF
GTID: 1310330428475340
Subject: Photogrammetry and Remote Sensing
Abstract/Summary:
With the continuous improvement of sensor technology, the volume of remote sensing data has exploded in recent years and is expected to keep growing. This growth has quickly created large processing challenges for remote sensing applications and research. Extracting hidden information from such large-scale remote sensing datasets for agricultural monitoring, environmental analysis and disaster evaluation is computationally expensive. To address the ever-increasing computational requirements of time-critical remote sensing applications, many recent research efforts have been directed towards incorporating high-performance computing (HPC) techniques into remote sensing missions.

Scalability is one of the most important properties of parallel algorithms for processing large-scale remote sensing datasets. However, the parallel programming models most widely used in current remote sensing research, such as MPI, CUDA and OpenMP, generally lack scalability. Although these models perform well in small-scale real-time applications, they cannot meet the large-scale computational needs of real-world near-real-time applications. MapReduce, proposed by Google, is now the most widely used framework for high-performance computing in clouds. Owing to its high scalability, reliability and low cost, a growing number of studies in recent years have leveraged MapReduce for processing big remote sensing data.

This dissertation aims to address some key issues, from both algorithmic and applied perspectives, in leveraging MapReduce for processing large-scale remote sensing data. From an algorithmic perspective, one key issue is the parallelization of global algorithms in remote sensing using the MapReduce model. Another key issue is the performance bottleneck that arises when parallelizing iterative remote sensing algorithms.
From an applied perspective, the key issue is that current research deploys MapReduce (Hadoop) platforms directly on physical machines, which limits how rapidly MapReduce can scale.

First, we propose MapReduce programming design patterns for parallelizing remote sensing algorithms, based on summarizing and refining current research. The design patterns comprise two main parts: the independent mode and the modular mode.

To address the parallelization of global algorithms with MapReduce, we choose Kaufman's clustering initialization algorithm as the specific research object. Kaufman's initialization algorithm is a typical global algorithm for processing remote sensing imagery. This dissertation presents MapReduce-based Parallel Kaufman (MPK), a parallel Kaufman implementation that uses MapReduce to accelerate the initialization step of clustering. As part of MPK, Grid-based Sequential Systematic Sampling (GS3), a new data partitioning method for remote sensing imagery, is also presented.

To address the bottleneck in parallelizing iterative algorithms with MapReduce, we choose the ISODATA clustering algorithm as the research object. We present Scalable Parallel ISODATA (SPI), a MapReduce-based parallel ISODATA clustering algorithm for processing massive remotely sensed imagery. SPI overcomes this bottleneck with (a) a Parallel Global SubSampling (PGSS) method for data decomposition, (b) a centroid filtering algorithm for refining intermediate clustering representatives, and (c) a single-pass final cluster mapping algorithm for obtaining the final clustering results.

To address the bottleneck of transferring large-scale remote sensing imagery in a distributed environment, we propose the BTI-Stream model for remote sensing imagery transmission, based on the P2P BitTorrent protocol.
The experimental results show that the model maintains the stability of the whole peer-to-peer network even in a flash crowd scenario, without sacrificing the efficiency of imagery transfer.

Finally, we deployed a MapReduce remote sensing processing system in a private cloud environment. This deployment approach addresses the aforementioned applied issue and gives the MapReduce system rapid, elastic scalability, benefiting from cloud computing virtualization technology.
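
To illustrate why Kaufman's initialization counts as a "global" algorithm, the following is a minimal serial sketch in Python of the commonly described procedure. It is not the MPK implementation from the dissertation, and it does not involve MapReduce or GS3. The function name `kaufman_init`, the use of Euclidean distance, and the reading of "most centrally located point" as the point with the smallest total distance to all others are assumptions made for this example.

```python
import math


def kaufman_init(points, k):
    """Pick k seed points with a serial Kaufman-style initialization.

    Illustrative sketch only: Euclidean distance and the definition of
    "most central" (smallest total distance) are assumptions.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(points)")

    # Full pairwise distance matrix. This O(n^2) dependency on every
    # other point is what makes the algorithm global and hard to split.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # First seed: the point with the smallest total distance to all points.
    first = min(range(n), key=lambda i: sum(dist[i]))
    seeds = [first]

    # nearest[j] is the distance from point j to its closest seed so far.
    nearest = list(dist[first])

    while len(seeds) < k:
        best_i, best_gain = None, -1.0
        for i in range(n):
            if i in seeds:
                continue
            # Gain of candidate i: how much closer the points would get to
            # their nearest seed if i were added. Existing seeds contribute
            # zero because their nearest distance is already 0.
            gain = sum(max(nearest[j] - dist[i][j], 0.0) for j in range(n))
            if gain > best_gain:
                best_i, best_gain = i, gain
        seeds.append(best_i)
        nearest = [min(nearest[j], dist[best_i][j]) for j in range(n)]

    return [points[i] for i in seeds]


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points (made-up data).
    pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kaufman_init(pixels, 2))
```

Every candidate's gain depends on its distance to every other point, so the data cannot simply be cut into independent blocks and processed separately. That dependency is the kind of obstacle the abstract refers to when it calls the algorithm global.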
Keywords/Search Tags: remote sensing parallel processing algorithm, MapReduce, clustering initialization algorithm for remote sensing, clustering, ISODATA, remote sensing imagery streaming