
Research On Massive Digital Image Data Mining Based On Hadoop Cloud Platform

Posted on: 2014-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhang
Full Text: PDF
GTID: 2248330392461051
Subject: Computer technology
Abstract/Summary:
Over the past few decades, computer and Internet technology have developed rapidly and have greatly promoted the development of all aspects of society. The computing model has passed through the Terminal-Host model of the mainframe era, the Client-Server model of the personal computer era, and the Browser-Server model of the current Internet era, arriving in recent years at the Cloud Computing model.

Cloud computing is an emerging computing model that integrates developments in parallel computing, distributed computing and grid computing, and provides unlimited computing resources in the form of simple and transparent services. Its basic principle is to provide computing, storage, software and hardware services from a resource pool made up of a large number of non-local computers. Users obtain these services over the network, with on-demand access and charging according to usage time, which improves the resource utilization ratio. Virtualization technology, distributed parallel computing, distributed storage, and distributed data management are the key technologies for achieving cloud computing.

With the rapid development of image acquisition and image storage technology, large amounts of useful image data can now be obtained, such as satellite remote sensing image data and medical image data. Image data mining analyzes these images and extracts useful information from them. How to store the growing number of images effectively and mine them quickly has become the biggest problem we face.

This paper attempts to use the Hadoop cloud platform for massive digital image data mining. Building on the Hadoop Distributed File System (HDFS) and the distributed parallel computing framework MapReduce, together with the existing theories and technologies of data mining and digital image data mining, massive digital image data mining can be achieved, which can solve the above problem.

This paper mainly accomplished the following tasks:

1. Summarize the knowledge related to cloud computing and its development history, and give a detailed analysis of the open source cloud platform based on Hadoop. Summarize the existing theories and technologies of data mining and digital image data mining.
2. Design the key-value pair types for digital image parallel processing based on the Hadoop MapReduce parallel computing framework, as well as the input and output formats for digital image files. Verify by experiment that digital image parallel processing based on the Hadoop MapReduce framework works.
3. Design how to build a massive digital image data mining system using the Hadoop cloud platform, and discuss its prospects.
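
The key-value idea in task 2 can be illustrated with a small in-memory sketch. This is not the thesis's design and does not use Hadoop itself: the record layout (file name as key, raw 8-bit grayscale pixels as value), the histogram job, and all function names below are assumptions chosen only to show how images expressed as key-value pairs can flow through a map, shuffle and reduce pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record type: key = image file name, value = raw 8-bit grayscale pixels.
ImageRecord = Tuple[str, bytes]


def map_image(record: ImageRecord, bins: int = 4) -> Iterator[Tuple[int, int]]:
    """Map step: emit (intensity_bin, 1) for every pixel of one image."""
    _name, pixels = record
    width = 256 // bins
    for p in pixels:
        # Clamp so the last bin absorbs any remainder when bins does not divide 256.
        yield (min(p // width, bins - 1), 1)


def shuffle(pairs: Iterable[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Shuffle step: group all emitted values by key, as the framework would."""
    grouped: Dict[int, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key: int, values: List[int]) -> Tuple[int, int]:
    """Reduce step: sum the per-pixel counts for one intensity bin."""
    return key, sum(values)


def run_job(records: Iterable[ImageRecord], bins: int = 4) -> Dict[int, int]:
    """Run map, shuffle and reduce sequentially over all image records."""
    pairs = (pair for rec in records for pair in map_image(rec, bins))
    grouped = shuffle(pairs)
    return dict(reduce_counts(k, v) for k, v in sorted(grouped.items()))


# Two tiny made-up "images" of four pixels each.
records = [
    ("a.raw", bytes([0, 10, 70, 200])),
    ("b.raw", bytes([255, 64, 63, 128])),
]
histogram = run_job(records)
```

Each image is handled independently in the map step, which is the property that would let a framework such as MapReduce spread images across many nodes; only the grouped counts need to be combined in the reduce step.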
Keywords/Search Tags: Cloud Computing, Hadoop, HDFS, MapReduce, Data Mining, Digital Image Data Mining