Research On The Key Techniques Of Massive Image Data Management Based On Hadoop

Posted on: 2011-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: S M Huo
GTID: 2178330338489832
Subject: Information and Communication Engineering
Abstract/Summary:
As spatial resolution and the number of spectral bands keep increasing, and as sensors of all kinds operate over ever longer time spans, image data sets are growing at a surprisingly high rate. Traditional Geographical Information Systems (GIS) rely on a centralized management mechanism, so the limits of a single node easily become the bottleneck in managing massive image data. The result is the phenomenon of "massive image data held, but little of it used". When most of the image data cannot be used, both the cost of acquiring it and the storage media holding it are wasted. Research on massive image data management is therefore an urgent task.

The cloud, composed of hundreds of thousands of servers and able to scale up and down easily, offers huge storage capacity and computing power. As "Cloud Computing" becomes more popular and mature, it becomes possible to manage massive image data by exploiting its advantages. Hadoop, an open-source project managed by the Apache Software Foundation, reduces the difficulty of developing distributed and parallel programs. Based on Hadoop, a distributed and parallel programming framework also regarded as an open-source "Cloud Computing" system, this dissertation studies the parallel construction of image pyramids and the storage of massive image data. The main contents are as follows:

First, the dissertation analyzes traditional centralized image data management technology, points out its shortcomings, and gives a brief introduction to Hadoop, including its characteristics and key components.

Second, the dissertation studies parallel image pyramid construction based on MapReduce, a parallel programming model. Considering the characteristics of image data, it proposes a parallel, recursive algorithm based on MapReduce that greatly speeds up image data processing by distributing the data to the slave nodes.

Third, the dissertation investigates distributed spatial indexing based on HBase, a column-oriented database and a Hadoop-related project, and proposes an index named P2H ("Pyramid 2 Hilbert Curve"). The index is suited to distributed storage of image data and keeps geographically adjacent image data physically adjacent or close together; it also improves query efficiency over massive image data. In addition, an image data storage model is put forward for storing and querying multi-resolution and multi-temporal image data.

Finally, based on the above results, experiments were designed that demonstrate the feasibility, reliability, and high performance of the proposed methods and solutions.
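
The abstract does not give the construction of the P2H index, so the following is only a minimal sketch of the general idea it describes: ordering the tiles of one pyramid level along a Hilbert curve so that tiles that are neighbours on the ground tend to receive nearby keys in a sorted key-value store such as HBase. The function names `hilbert_index` and `row_key`, the key layout, and the assumption that pyramid level `L` holds a `2**L` by `2**L` tile grid are all hypothetical and are not taken from the thesis.

```python
def hilbert_index(order: int, x: int, y: int) -> int:
    """Distance of tile (x, y) along a Hilbert curve over a 2**order grid.

    Standard iterative xy-to-distance conversion; not the thesis's own code.
    """
    n = 1 << order
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("tile coordinates outside the grid")
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def row_key(level: int, x: int, y: int) -> str:
    """Hypothetical row key: pyramid level first, then Hilbert distance.

    Assumes level L is a 2**L x 2**L tile grid (an illustrative choice).
    Zero padding keeps lexicographic order equal to numeric order.
    """
    return f"L{level:02d}_{hilbert_index(level, x, y):012d}"


if __name__ == "__main__":
    # Keys for the four tiles of level 1, listed in Hilbert order.
    for x, y in [(0, 0), (0, 1), (1, 1), (1, 0)]:
        print((x, y), row_key(1, x, y))
```

Putting the level first groups all tiles of one resolution together, and the Hilbert distance then orders tiles within that level so that consecutive keys always refer to edge-adjacent tiles. Whether P2H orders its key components this way is not stated in the abstract.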
Keywords/Search Tags: Hadoop, Cloud Computing, Massive Image Data, Image Pyramid, Image Storage Model