
Research On Building And Using Of Dataset For Intrusion Detection

Posted on: 2005-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Meng
Full Text: PDF
GTID: 2168360152969253
Subject: Computer application technology
Abstract/Summary:
Intrusion detection based on data mining is an active research topic in network security. A central issue for this approach is the methodology for building and using datasets, i.e. the training dataset and the testing dataset. The pattern set of an IDS is built from the training dataset by data mining, so the quality of the training dataset directly affects the quality of the pattern set and, in turn, the effectiveness of detection. Testing an IDS accurately is equally significant, because it gives users a basis for choosing a system and gives developers a debugging tool, and a good testing dataset clearly benefits such testing. Unfortunately, research in both areas is still at a preliminary stage, and no industry standard has emerged so far because of the variety of real-world environments and systems under development.
Network traffic data is divided into normal data and attack data. Traffic on a normally operating network is assumed to be normal, so data captured on such a network is used as normal data. Attack data is produced by simulating attack behaviours. The normal data and the attack data are then mixed in a definite proportion to produce mixed data. Once the dataset has been produced, it is transformed into a formatted dataset whose attributes are chosen according to the given intrusion detection algorithm. Records are selected from this dataset, in a given proportion and by a given method, to form the training dataset. The training dataset is then optimized by removing noisy and atypical data; for this purpose a method called k-NN for IDS, a modified version of the k-NN algorithm, is proposed. The resulting training dataset can be extended easily and more closely resembles a realistic environment. The testing dataset can be generated in the same way as the training dataset, except that its records can be chosen directly from the mixed data.
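The mixing and clean-up steps described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's method: the abstract does not say how its k-NN variant differs from plain k-NN, so the filter below uses an ordinary edited-nearest-neighbour rule as a stand-in (a record is dropped when most of its k nearest neighbours carry a different label). The function names `mix_records` and `knn_filter`, the two-feature records, and the 30% attack ratio are all hypothetical.

```python
import random
from collections import Counter


def mix_records(normal, attack, attack_ratio, size, seed=0):
    """Draw `size` labelled records, about `attack_ratio` of them attacks.

    Hypothetical helper: raises ValueError if either pool is too small.
    """
    rng = random.Random(seed)
    n_attack = round(size * attack_ratio)
    n_normal = size - n_attack
    mixed = [(x, "normal") for x in rng.sample(normal, n_normal)]
    mixed += [(x, "attack") for x in rng.sample(attack, n_attack)]
    rng.shuffle(mixed)
    return mixed


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def knn_filter(records, k=3):
    """Stand-in for the thesis's k-NN variant (assumption, not the original).

    Keeps a record only if the majority label among its k nearest
    neighbours matches its own label. Use an odd k with two classes
    so the vote cannot tie.
    """
    kept = []
    for i, (x, label) in enumerate(records):
        others = [r for j, r in enumerate(records) if j != i]
        neighbours = sorted(others, key=lambda r: squared_distance(x, r[0]))[:k]
        votes = Counter(lbl for _, lbl in neighbours)
        if votes.most_common(1)[0][0] == label:
            kept.append((x, label))
    return kept


# Toy feature vectors: normal traffic near the origin, attacks far away.
normal = [(0.1 * i, 0.0) for i in range(10)]
attack = [(5.0 + 0.1 * i, 5.0) for i in range(10)]

mixed = mix_records(normal, attack, attack_ratio=0.3, size=10)

# A deliberately mislabelled record sitting inside the normal cluster.
noisy = ((0.15, 0.0), "attack")
candidates = (
    [(x, "normal") for x in normal[:5]]
    + [(x, "attack") for x in attack[:5]]
    + [noisy]
)
training = knn_filter(candidates, k=3)
```

In this toy run the mislabelled record is surrounded by normal records, so the filter removes it and leaves the ten well-clustered records. A real pipeline would work on formatted connection records with attributes chosen for the detection algorithm, and would need a faster neighbour search than this quadratic loop.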
Because of the constraints of a real network environment, some attacks cannot be launched on a specific network. Staged-Mixed-Test reduces the effect of these constraints on intrusion detection system testing by using online or offline datasets in different phases, according to the emphasis of each phase.
Keywords/Search Tags: Intrusion Detection System, Training dataset, Testing dataset, Vulnerability, Methodology for Testing