Design And Implementation Of GTalk Traffic Identification System Based On Machine Learning

Posted on: 2014-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: L N Ma
Full Text: PDF
GTID: 2268330422451984
Subject: Software engineering
Abstract/Summary:
In recent years, with the rapid development of network technology, the growing number of real-time communication tools has made effective control of network traffic very difficult. In high-volume network traffic, some abnormal traffic is inevitable, and some users rely on certain software for illegal or malicious operations. The Google Talk instant messaging tool is a typical example: it can traverse the network as a network proxy, which causes great trouble for network security and network management. Effective recognition of Google Talk traffic is therefore urgently needed. Most existing Google Talk traffic recognition systems can only recognize Google Talk traffic as a whole and cannot subdivide it further. This thesis focuses on efficient, real-time recognition of network traffic and on subdividing Google Talk traffic into four kinds of flow. The system uses four traffic recognition techniques, with machine learning as the primary technique and the remaining three as auxiliaries, to achieve Google Talk traffic segmentation.
Starting from the practical significance of a Google Talk traffic recognition system, the thesis analyzes the state of research at home and abroad. In the requirements analysis and outline design phase, a user use case model is established to determine the functional and non-functional requirements, the technical solution is chosen, and a machine learning-based Google Talk traffic recognition system is designed. In the detailed design stage, the system is divided into three modules: a traffic capture module, a traffic recognition module, and a log module. The traffic capture module captures network traffic in two ways, online and offline. The traffic recognition module performs Google Talk recognition and segmentation. The log module supports viewing logs, analysing logs, and statistics.
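As a rough illustration of two of the auxiliary techniques named in the keywords (port filtering and packet size distribution), the following Python sketch pre-filters flows by port and turns a flow's packet sizes into a normalized histogram that a classifier could consume. It is not the thesis's implementation: the port set, the bin edges, and all function names are assumptions made for this example. Port 5222 is the standard XMPP client port; 443 is included only as a hypothetical fallback.

```python
from collections import Counter

# Assumption: candidate ports for the pre-filter. 5222 is the standard XMPP
# client port; 443 is a hypothetical fallback. The thesis's list is not given.
CANDIDATE_PORTS = {5222, 443}

# Hypothetical upper bin edges (in bytes) for the packet size histogram.
BIN_EDGES = (64, 128, 256, 512, 1024, 1500)


def port_prefilter(src_port, dst_port, ports=CANDIDATE_PORTS):
    """Return True if either endpoint of the flow uses a candidate port."""
    return src_port in ports or dst_port in ports


def size_bin(size, edges=BIN_EDGES):
    """Return the index of the first bin whose upper edge is >= size."""
    for i, edge in enumerate(edges):
        if size <= edge:
            return i
    return len(edges)  # overflow bin for sizes above the last edge


def packet_size_distribution(sizes, edges=BIN_EDGES):
    """Return the normalized packet size histogram for one flow."""
    n_bins = len(edges) + 1
    if not sizes:
        return [0.0] * n_bins
    counts = Counter(size_bin(s, edges) for s in sizes)
    return [counts.get(i, 0) / len(sizes) for i in range(n_bins)]
```

A flow that passes `port_prefilter` would then be described by the vector returned from `packet_size_distribution`, which can serve as the feature input to a machine learning classifier.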
In the system implementation stage, the key operational processes are described in detail through process flow diagrams. The emphasis is on the models built from three machine learning classification algorithms, supplemented by the other three traffic recognition techniques. The three classification algorithms are evaluated on several sets of data, and the C4.5 decision tree, which gives the best results, is selected as the system's final algorithm. In the system testing stage, the functional and non-functional requirements are further validated, and the results show that the system is accurate and practical. Finally, the advantages and disadvantages of the system are summarized.
The system has been formally put into use, and detection models for the four kinds of Google Talk traffic are deployed online. During online operation the system has run well, and later tracking and feedback show that it detects Google Talk flows well.
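For readers unfamiliar with C4.5, the sketch below shows its split criterion, the gain ratio (information gain divided by split information), on categorical attributes. It is a minimal illustration only: the full algorithm also handles continuous attributes and pruning, and the toy attributes and the labels "chat" and "voice" are hypothetical, since the abstract does not name the four flow types or the features used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """C4.5 gain ratio for splitting on the categorical attribute at index attr."""
    total = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Expected entropy after the split, weighted by partition size.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = -sum(len(p) / total * math.log2(len(p) / total)
                      for p in partitions.values())
    return info_gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, labels):
    """Index of the attribute with the highest gain ratio."""
    return max(range(len(rows[0])), key=lambda a: gain_ratio(rows, labels, a))


# Hypothetical toy flows: (packet size class, transport protocol).
rows = [("small", "tcp"), ("small", "udp"), ("large", "tcp"), ("large", "udp")]
labels = ["chat", "chat", "voice", "voice"]
```

On this toy data, `best_attribute(rows, labels)` picks attribute 0, because the packet size class separates the two labels perfectly while the transport protocol carries no information about them.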
Keywords/Search Tags: Traffic Identification, Machine Learning, port filtering, payload, packet size distribution