
Analysis And Application Development Of Hadoop Distributed Computing Platform

Posted on: 2015-01-27, Degree: Master, Type: Thesis
Country: China, Candidate: T W Xiao, Full Text: PDF
GTID: 2208330434959737, Subject: Computer technology
Abstract/Summary:
Hadoop is a software platform developed under the Apache Software Foundation for large-scale distributed computing. It provides a distributed file system and a parallel execution environment, allowing users to process very large volumes of data in a distributed environment, and it is now widely used in cloud computing. This thesis first examines Hadoop in three areas: the Hadoop Distributed File System (HDFS), the distributed computing model (MapReduce), and job control in the distributed environment. It discusses the platform's architecture and computation process, and clarifies how key components of the Hadoop framework operate and how the framework is implemented as a whole. It then presents the detailed design and implementation of a validation application for the platform: a web crawler for hyperlink URLs. The application runs on Hadoop and collects the hyperlink addresses of web pages down to a specified depth using distributed processing. The program serves as a practical exercise in programming for the Hadoop platform and configuring its environment.
Keywords/Search Tags: Distributed computing, Hadoop, HDFS, MapReduce
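
The abstract describes a crawler that collects hyperlink addresses to a specified depth using distributed processing, but gives no implementation details. The sketch below illustrates one way such a crawl could be phrased as repeated map and reduce rounds, one round per depth level. It is not the thesis's code: it runs in a single Python process rather than on Hadoop, and the `PAGES` dictionary, the example URLs, and the function names `map_page`, `reduce_links`, and `crawl` are all hypothetical stand-ins chosen for illustration.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "web": URL -> HTML body. A real crawler would
# fetch each page over HTTP; this dictionary is an assumption that keeps
# the sketch self-contained.
PAGES = {
    "http://a.example/": '<a href="http://b.example/">b</a> <a href="http://c.example/">c</a>',
    "http://b.example/": '<a href="http://c.example/">c</a> <a href="http://d.example/">d</a>',
    "http://c.example/": '<a href="http://a.example/">a</a>',
    "http://d.example/": '<a href="http://e.example/">e</a>',
    "http://e.example/": "",
}

HREF = re.compile(r'href="([^"]+)"')


def map_page(url, depth):
    """Map step: emit (link, depth + 1) for every hyperlink on the page."""
    for link in HREF.findall(PAGES.get(url, "")):
        yield link, depth + 1


def reduce_links(link, depths):
    """Reduce step: collapse duplicate links, keeping the smallest depth."""
    return link, min(depths)


def crawl(seed, max_depth):
    """Run one map/reduce round per depth level, starting from the seed."""
    visited = {seed: 0}
    frontier = [(seed, 0)]
    for _ in range(max_depth):
        # "Shuffle": group the mapper output by link.
        grouped = defaultdict(list)
        for url, depth in frontier:
            for link, link_depth in map_page(url, depth):
                grouped[link].append(link_depth)
        # Reduce, then keep only links not seen in earlier rounds.
        frontier = []
        for link, depths in grouped.items():
            link, link_depth = reduce_links(link, depths)
            if link not in visited:
                visited[link] = link_depth
                frontier.append((link, link_depth))
    return visited


if __name__ == "__main__":
    for url, depth in sorted(crawl("http://a.example/", 2).items()):
        print(depth, url)
```

With a depth limit of 2, the seed page, its direct links, and their links are collected, while `http://e.example/` (three hops away) is not. In a Hadoop job, each round would presumably read the previous round's output from HDFS, which is why the frontier is passed from one iteration to the next here.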