Design and Implementation of Parallelized Network Traffic Diversion Based on the Hadoop Model

Posted on: 2013-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: T H Zheng
GTID: 2248330374470362
Subject: Computer Science and Technology
Abstract/Summary:
Network traffic classification provides important technical support for network management, security forecasting, and traffic engineering. As new types of Internet applications keep emerging and their behavior changes rapidly, research on traffic classification has become increasingly significant. Classification relies on traffic shunting, that is, dividing captured packets into network flows, as one of its steps, and shunting has become the bottleneck of traffic classification technology.

Cloud computing offers a new way of thinking about service development, together with a transparent and simple programming model. Hadoop, an open-source platform developed under the Apache Software Foundation, is among the most widely used cloud computing platforms. It implements the MapReduce programming model and the Hadoop Distributed File System (HDFS) in Java. MapReduce is a simple parallel computing model that makes large-scale computation possible through automatic concurrent and distributed execution, and it also serves as a good scheduling model. HDFS stores each file as a sequence of blocks and replicates those blocks to ensure high reliability.

In this thesis, we study the key technologies of the MapReduce model and of network traffic shunting and, building on current work in the field, implement a parallel processing system for network traffic shunting. First, we introduce the architecture of network traffic classification, the concept of a network flow, and how shunting is carried out. Second, after analyzing the feasibility of parallel shunting, we propose a method for extracting the five elements (the five-tuple) of each packet and assembling the packets into complete network flows in a Linux environment. Finally, we capture traffic from the Inner Mongolia University campus network and run experiments to verify the system. The results show that the system has a significant advantage on large data sets: compared with a multithreaded implementation on a single computer, its running time is shorter by a factor of about 3.33.
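The flow-assembly step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it imitates the map, shuffle, and reduce phases in plain Python rather than on Hadoop, and the `Packet` fields, the function names, and the choice to treat both directions of a conversation as one flow are all assumptions made for illustration. The five elements are assumed to be the usual five-tuple of source address, source port, destination address, destination port, and protocol.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical parsed packet header; field names are illustrative."""
    ts: float       # capture timestamp in seconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str      # e.g. "TCP" or "UDP"
    length: int     # packet length in bytes


FlowKey = Tuple[Tuple[str, int], Tuple[str, int], str]


def flow_key(p: Packet) -> FlowKey:
    """Build a direction-independent five-tuple key (an assumption:
    both directions of a conversation are merged into one flow)."""
    low, high = sorted([(p.src_ip, p.src_port), (p.dst_ip, p.dst_port)])
    return (low, high, p.proto)


def map_phase(packets: Iterable[Packet]) -> Iterator[Tuple[FlowKey, Packet]]:
    """Map: emit one (five-tuple key, packet) pair per packet."""
    for p in packets:
        yield flow_key(p), p


def shuffle(pairs: Iterable[Tuple[FlowKey, Packet]]) -> Dict[FlowKey, List[Packet]]:
    """Shuffle: group packets by key, as the framework would between phases."""
    groups: Dict[FlowKey, List[Packet]] = defaultdict(list)
    for key, packet in pairs:
        groups[key].append(packet)
    return groups


def reduce_phase(key: FlowKey, packets: List[Packet]) -> dict:
    """Reduce: order one flow's packets by time and summarize the flow."""
    ordered = sorted(packets, key=lambda p: p.ts)
    return {
        "key": key,
        "packets": len(ordered),
        "bytes": sum(p.length for p in ordered),
        "start": ordered[0].ts,
        "end": ordered[-1].ts,
    }


def shunt(packets: Iterable[Packet]) -> List[dict]:
    """Run map, shuffle, and reduce in sequence on a single machine."""
    groups = shuffle(map_phase(packets))
    return [reduce_phase(key, group) for key, group in groups.items()]
```

Calling `shunt` on a list of `Packet` records returns one summary per flow. Because each key is reduced independently of the others, the reduce work can be spread across machines, which is the property a MapReduce deployment would exploit.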
Keywords/Search Tags: Hadoop, network traffic, traffic shunt, parallel processing