
Hadoop-based Network Verification Platform Research

Posted on: 2012-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Xu
Full Text: PDF
GTID: 2178330335974317
Subject: Computer application technology
Abstract/Summary:
Cloud computing, a concept proposed at the end of 2007, is a revolutionary innovation: it means computing capacity can be circulated like commodities such as gas, electricity, and water, conveniently and at low cost. Its main difference from ordinary commodities is that it is delivered over the Internet. Google, IBM, Amazon, and other IT giants have launched their own commercial cloud computing platforms and made them central to their future development strategies. The study of cloud computing therefore not only follows the industry's technology trend but also has great practical value.

The back-end system of a cloud computing platform contains tens of thousands of servers, and how to organize such a large number of servers effectively is a key problem for the efficient and stable operation of the system. A reasonable network topology can improve network performance and also ensure network stability, keeping the system running normally when some nodes or links fail or become congested. The network topology of a cloud computing back-end system differs from common topologies, so it requires fresh consideration and further research.

Data is the carrier of information, while information is the content of data; data is therefore generally considered the basis of an information system, and using computers to process data and extract information is its basic function. In today's highly information-oriented society, the Web is regarded as the largest information system, characterized by massive, diverse, heterogeneous, and dynamically changing data. Quickly extracting valuable information for enterprises from such massive data has become the biggest headache for programmers during software development. On this basis, this thesis analyzes existing key technologies such as distributed storage and distributed computing and, combining research on Hadoop cluster technology with business requirements and the actual hardware and software capabilities, proposes a Hadoop-based data processing model. The development approach of the model is introduced from the aspects of data structure design, program organization, and programming techniques. The model is applied to the web log pretreatment process of a network authentication platform; it allows programmers to harness the resources of a very large distributed system without much experience in parallel processing or distributed systems. The model can also be used in other network applications that handle large amounts of data, such as picture storage, search engines, and grid computing.

This topic combines the model with business applications, using cutting-edge distributed framework technology to better meet project requirements. The model can be deployed to real instances, and experiments verify its practical value in terms of efficiency, cost, scalability, ease of maintenance, and so on. In integrating the model with the original pretreatment system, we optimized the primary model, including MapReduce job scheduling, the sorting algorithms, and the fault-tolerance mechanism of the cluster system.
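The thesis abstract does not include source code, so the following is only a minimal sketch of what the web log pretreatment step could look like as a map-only Hadoop MapReduce job. The class names (LogPretreatment, CleanMapper) and the assumed common-log field layout are illustrative assumptions, not taken from the thesis.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogPretreatment {

    // Map-only job: each mapper parses one raw log line, drops malformed
    // records, and emits the cleaned fields as tab-separated text.
    public static class CleanMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed layout (common log format):
            // ip - - [timestamp] "METHOD url protocol" status bytes
            String[] fields = value.toString().split(" ");
            if (fields.length < 10) {
                return; // skip lines that do not match the expected layout
            }
            String ip = fields[0];
            String url = fields[6];
            String status = fields[fields.length - 2];
            context.write(new Text(ip + "\t" + url + "\t" + status),
                          NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "web log pretreatment");
        job.setJarByClass(LogPretreatment.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // cleaning needs no reduce phase
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Such a job would be submitted with the standard hadoop jar command, passing input and output HDFS paths as arguments. A map-only design is a natural fit for pretreatment, since each log line is cleaned independently and aggregation can be left to later MapReduce stages.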
Keywords/Search Tags: distributed data processing, massive data, Hadoop