The Management Of Massive Images Data Based On Hadoop

Posted on: 2012-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: B Li
GTID: 2178330335465453
Subject: Computer software and theory
Abstract/Summary:
Over the past decade, advances in science and technology, the widespread use of computers, and especially the rapid growth of the Internet have steadily multiplied the sources of data and the volume of data they produce. Data sets now reach the petabyte (PB) level; examples include the transaction data of Alibaba and eBay, real-time image data from monitoring systems, and the log data of Tencent. Image data is growing even faster than text data on the Internet, which poses new challenges for managing it effectively, and how to store and manage image data efficiently has become a new research hotspot. Earlier general-purpose systems for managing massive image data are not well suited to current applications, so new solutions and systems are constantly being proposed.

Against this background, and in comparison with other image data management systems and solutions, this thesis analyzes how massive image data is generated and applied and, drawing on Hadoop's success in storing and managing Web data and log data, studies large-scale image data management based on Hadoop. The Hadoop Distributed File System and the MapReduce parallel programming framework are open-source implementations of the designs described in Google's papers and are mainly used for Web data management and mining, so Hadoop has shortcomings when applied to image data storage and management. This thesis therefore first extends the corresponding functional modules of Hadoop, and then designs and implements a Hadoop-based management system for massive image data that covers storage, pre-processing, and query and browsing. The thesis focuses on the implementation of data loading, data services, and data requests.

In addition, the design of parallel image processing algorithms and the experiments on them are another important contribution, and the thesis reports the experimental results. Because remote sensing data has a large number of metadata items and its files are usually large, we adjust the Hadoop system configuration and test Hadoop's performance so that it becomes an effective platform for managing massive image data.
Keywords/Search Tags: Hadoop, Massive image data, Distributed, MapReduce