The Detection And Filtration Of Anomaly Network Flow Based On Decision Trees

Posted on: 2014-02-05    Degree: Master    Type: Thesis
Country: China    Candidate: W B Ding    Full Text: PDF
GTID: 2268330401964553    Subject: Computer application technology
Abstract/Summary:
The rapid development of computer networks has brought people convenience, but the emergence of abnormal traffic flows has also brought many security problems. Current detection systems, such as the Intrusion Detection System (IDS), all have shortcomings to some degree. Commercial IDS products lack self-learning ability, so development and maintenance personnel must update the virus data promptly to keep the system working properly.

With the arrival of machine learning and data mining, new ideas and methods have emerged for detecting abnormal network flows. Because of its attack behavior, an abnormal flow differs to some extent from a normal flow in its flow characteristics, and these differences can be captured by models and rules mined with machine learning and data mining techniques.

Research and experiments on abnormal flow detection using machine learning are emerging, mostly with good results. However, because abnormal network flows are complex and diverse, many problems remain in these studies: for example, some experiments require a large number of training samples to be effective, while others still show relatively high false negative and/or false positive rates. The study of machine-learning-based abnormal flow detection therefore still has a long way to go.

Through in-depth study of the decision tree, random forest, and AdaBoost classification algorithms, this thesis proposes an AdaBoost group algorithm (AdaBoosts) based on C4.5, which introduces the voting mechanism of random forest decision trees into AdaBoost. Training on data yields a group of AdaBoost classifiers (an AdaBoosts), whose combined results determine whether a network flow is abnormal. The algorithm trains weak classifiers with C4.5 and then constructs each AdaBoost as a weighted combination of these weak classifiers.
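The weighted combination of weak classifiers described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: single-feature threshold stumps stand in for C4.5 trees to keep the example short, and the function names (`train_stump`, `adaboost_train`, `adaboost_predict`), the toy feature values, and the +1 (abnormal) / -1 (normal) label encoding are all assumptions made for illustration.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump.
    Stumps are an assumed stand-in for the C4.5 weak classifiers."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] <= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] <= thr else -pol

def adaboost_train(X, y, rounds=3):
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    model = []                             # list of (alpha, stump) pairs
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)         # avoid division by zero
        if err >= 0.5:                     # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in model)
    return 1 if score >= 0 else -1

# Hypothetical one-feature toy flows: +1 = abnormal, -1 = normal.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost_train(X, y, rounds=3)
print([adaboost_predict(model, row) for row in X])
```

On this toy data no single threshold separates the two classes, but the three weighted stumps together classify every training sample correctly.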
To reduce the similarity between the AdaBoost classifiers, the thesis applies four kinds of randomization to the original sample set and attribute set, obtaining a sample subset and an attribute subset for each AdaBoost. To verify the effectiveness of the algorithm, the thesis designs and implements an abnormal flow detection system, which extracts attributes from network flows on the OPNET platform and then detects abnormal flows with the AdaBoosts algorithm. In addition to the KDD data set and DARPA packets, the thesis uses Wireshark to capture network traffic as part of the experimental data. The system is tested on these data sets at the end of the thesis. Analysis of the test results shows that AdaBoosts achieves a higher detection rate with less training data and is not much slower than AdaBoost.
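The idea of giving each group member its own sample subset and attribute subset, then combining the members by voting, might look roughly like the Python sketch below. Several things here are assumptions rather than details from the abstract: only two randomizations are shown (the thesis's four are not specified here), samples are drawn without replacement, a simple nearest-class-mean learner (`train_centroid`) stands in for a full AdaBoost of C4.5 trees, and all names, parameters, and toy data are hypothetical.

```python
import random
from collections import Counter

def train_centroid(X, y):
    """Stand-in base learner (nearest class mean), NOT an AdaBoost of C4.5 trees."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, v in enumerate(row):
            acc[k] += v
    means = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(row):
        return min(means, key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(row, means[c])))
    return predict

def train_group(X, y, n_members=5, sample_frac=0.8, n_attrs=2,
                seed=0, base=train_centroid):
    """Train each member on a random sample subset and a random attribute subset."""
    rng = random.Random(seed)
    n_rows = max(1, int(sample_frac * len(X)))
    members = []
    for _ in range(n_members):
        rows = rng.sample(range(len(X)), n_rows)        # random sample subset
        attrs = rng.sample(range(len(X[0])), n_attrs)   # random attribute subset
        Xs = [[X[i][a] for a in attrs] for i in rows]
        ys = [y[i] for i in rows]
        members.append((attrs, base(Xs, ys)))
    return members

def predict_group(members, row):
    """Majority vote over members, each seeing only its own attributes."""
    votes = Counter(clf([row[a] for a in attrs]) for attrs, clf in members)
    return votes.most_common(1)[0][0]

# Hypothetical toy flows with three numeric attributes each.
X = [[1, 2, 1], [2, 1, 2], [1, 1, 1], [2, 2, 2], [1, 2, 2],
     [9, 8, 9], [8, 9, 8], [9, 9, 9], [8, 8, 8], [9, 8, 8]]
y = ["normal"] * 5 + ["abnormal"] * 5
group = train_group(X, y)
print(predict_group(group, [8.5, 9.0, 8.0]))
```

Because every member sees different rows and attributes, their errors are less correlated, which is the usual motivation for this kind of randomization. Passing a different `base` function would let the same voting scheme wrap a boosted classifier instead of the stand-in.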
Keywords/Search Tags: Decision Tree, AdaBoost, Random Forests, C4.5, Abnormal Network Flow