
Research And Implement Of Network Identification System Based On Clustering Algorithm

Posted on: 2013-06-19 | Degree: Master | Type: Thesis
Country: China | Candidate: B Liang | Full Text: PDF
GTID: 2248330374482247 | Subject: Computer technology
Abstract/Summary:
The development and spread of Internet technology have changed how people live: online news, online shopping, e-commerce, online chat and a variety of other network applications have emerged. Since the 1990s, major breakthroughs in broadband network technology and applications have driven steady growth in network bandwidth. Users expect a better Internet experience, so ever more data crosses the network, and this growth in traffic challenges the hardware of the backbone network. If we can accurately identify which network application sent each packet, we can control and manage the network. We can filter out traffic belonging to illegal network applications and limit the share of bandwidth consumed by applications that transfer large volumes of data, thereby controlling the bandwidth allotted to each kind of network service, protecting business-critical services, suppressing undesirable ones and strengthening control over quality of service. Accurate and rapid identification of the category of a network application therefore plays an important role in network management and network monitoring.

There are currently two mainstream methods for identifying network application protocols: payload-based identification and behavior-based identification. Payload-based identification uses deep packet inspection together with protocol analysis and reconstruction: it extracts the application-layer data, analyzes it against the characteristic features of each protocol, and matches specific strings at fixed positions to determine whether the traffic uses a given protocol. Payload-based identification is highly accurate, but it can only identify known network applications. Behavior-based identification assumes that objects of the same category share a stable set of characteristics, where a characteristic can be any attribute related to the category.
Observing Internet applications from different angles yields a feature set built from their behavioral characteristics. Classification on these features can effectively distinguish between applications and can identify unknown network applications, but behavioral characteristics are affected by real-time changes in network conditions.

One contribution of this thesis is a new method for identifying network application protocols, based on a clustering algorithm. First, deep packet inspection and matching against a regular expression for each protocol identify part of the network flows in the traffic. Next, a behavior feature vector is extracted for every network flow, projecting each flow onto a multidimensional space. Finally, a clustering algorithm is run in that space so that flows using the same protocol cluster together; the protocols of the already-identified flows in each cluster are then analyzed to identify all of the flows. The method combines payload-based identification with flow-behavior-based identification and has the advantages of both: it accurately identifies known network applications and can also identify unknown application protocols, and it can still identify network traffic efficiently when network conditions change.

Another contribution of this thesis is a framework system for network application protocol identification based on a clustering algorithm.
The framework uses multithreading to improve overall execution efficiency and a modular design that gives good scalability; its configuration, including the traffic sampling algorithm, the network flow classification rules and the clustering settings, can be modified.

As network traffic grows, so does the processing burden on the system, and the system does not process every captured packet. Traffic sampling algorithms and more efficient clustering algorithms are therefore left for future research.
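
The three-step method described in the abstract (signature matching, feature extraction, clustering) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The two payload signatures, the two-dimensional feature vector (mean packet size, mean inter-arrival time), the use of plain k-means as the clustering algorithm and the majority vote inside each cluster are all assumptions made for illustration; the abstract does not name the features or the clustering algorithm actually used.

```python
import re
from collections import Counter

# Hypothetical payload signatures (assumption); real rule sets are far larger.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]"),
    "ssh": re.compile(rb"^SSH-\d\.\d"),
}

def label_by_payload(payload):
    """Step 1: return the protocol whose signature matches, else None."""
    for name, pattern in SIGNATURES.items():
        if pattern.search(payload):
            return name
    return None

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain k-means (assumed stand-in), seeded with the first k points.

    Assumes k <= len(points). Returns one cluster index per point.
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each non-empty cluster's centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

def identify_flows(flows, k):
    """flows: list of (payload_bytes, feature_vector) pairs.

    Steps 2 and 3: cluster the feature vectors, then give each flow without
    a signature match the majority label of the matched flows in its cluster.
    """
    seeds = [label_by_payload(payload) for payload, _ in flows]
    clusters = kmeans([features for _, features in flows], k)
    votes = {}
    for seed, cluster in zip(seeds, clusters):
        if seed is not None:
            votes.setdefault(cluster, Counter())[seed] += 1
    result = []
    for seed, cluster in zip(seeds, clusters):
        if seed is not None:
            result.append(seed)
        elif cluster in votes:
            result.append(votes[cluster].most_common(1)[0][0])
        else:
            result.append("unknown")  # cluster contains no matched flow
    return result

# Feature vector (assumed): [mean packet size in bytes, mean inter-arrival time in s]
flows = [
    (b"GET /index.html HTTP/1.1\r\n", [1200.0, 0.05]),
    (b"SSH-2.0-OpenSSH_8.9\r\n", [90.0, 0.90]),
    (b"\x17\x03\x03\x00\x20", [1150.0, 0.07]),  # no signature match
    (b"\x00\x01", [100.0, 0.85]),               # no signature match
]
print(identify_flows(flows, k=2))  # ['http', 'ssh', 'http', 'ssh']
```

In this sketch the two flows without a signature match inherit the label of the matched flow that shares their cluster, and a cluster with no matched flow is reported as "unknown". The features are left unscaled for brevity, so packet size dominates the distance; a practical system would normalize them first.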
Keywords/Search Tags: network flow, traffic identification, network behavior characteristics, clustering