
Research Of Intrusion Detection Model Based On Data Stream Feature Selection And Classification Algorithm

Posted on: 2017-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: L Feng
Full Text: PDF
GTID: 2308330485998902
Subject: Software engineering
Abstract/Summary:
In recent years, the rapid development and gradual popularization of the Internet have made it indispensable to people's work and daily life. Big Data has brought ever more of the real world onto the Internet, yet the Internet environment remains complex and insufficiently secure. The research described below was carried out to address these problems.

Existing intrusion detection techniques are analyzed and compared in detail. The analysis finds that the data classification algorithms of data mining have very broad application prospects in intrusion detection research. Traditional intrusion detection systems based on data mining algorithms process static data blocks, and the data must be scanned repeatedly during detection. The object of study, however, has now become the data stream, for which scanning the data is very expensive. Our research therefore needs to overcome the following challenges:
1) The data may be scanned only once;
2) The memory usage of the algorithm must be independent of the number of data samples;
3) The classification process must be fast.

The Hoeffding tree is a classification algorithm suited to data streams. We propose Hoeffding-ID (Hoeffding-Intrusion Detection Tree Algorithm), an improvement on the original Hoeffding tree that fixes its drawback. Hoeffding-ID needs to scan the data only once, and almost all of its memory is used to store attribute statistics, so its memory usage is independent of the number of data samples. Hoeffding-ID is therefore theoretically suitable for the data stream environment. Compared with J48, a well-regarded classification algorithm with high accuracy and a low error rate, Hoeffding-ID achieves relatively high classification accuracy and a low error rate.
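The split decision that lets a Hoeffding tree work from a single pass can be sketched as follows. This is a minimal illustration of the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), as used in the original Hoeffding tree; it is not the Hoeffding-ID improvement, whose details the abstract does not give. The function names and the choice of information gain (with range R = log2 of the number of classes) are assumptions made for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound epsilon for a quantity with the given range.

    With probability 1 - delta, the true mean lies within epsilon of the
    mean observed over n independent samples.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 n_classes: int, delta: float, n: int) -> bool:
    """Hypothetical helper: split a leaf once the best attribute is
    reliably better than the runner-up.

    Assumes information gain as the split measure, whose range is
    log2(n_classes).
    """
    value_range = math.log2(n_classes)
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

Because the bound depends only on the sample count n and not on the samples themselves, a leaf needs to keep only per-attribute statistics, which is consistent with the memory property described above.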
Hoeffding-ID also performs better than J48 in terms of memory usage.

To reduce memory usage and accelerate the classification process, the data samples need to be preprocessed to reduce their dimensionality. Feature selection has therefore become the key to improving the efficiency of data stream classification algorithms.

In this paper, we propose set unit mutual information, a measure based on the mutual information metric. In theory, the mutual information of a feature subset measures the influence of the feature set on the classification results more reasonably. The proposed feature selection algorithm, HSF, also introduces the Hoeffding inequality as the selection criterion for feature subsets. We choose the classic BIF feature selection algorithm as the baseline in the experiments. The classification accuracy obtained with the subset selected by HSF is higher than that obtained with the subset selected by BIF. In other words, the feature subset selected by HSF is more accurate than the one selected by BIF.

In the last section, the improved data stream classification algorithm and the feature selection algorithm are integrated, with a synchronization mechanism, into a unified intrusion detection model, HIDM. Experiments show that HIDM improves the accuracy of intrusion detection and reduces the detection time. HIDM also reduces memory usage and improves the efficiency of the intrusion detection model.
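As a rough illustration of the mutual information metric that the feature selection builds on, the sketch below estimates the ordinary mutual information I(X; Y) between one discrete feature and the class label from counts updated in a single pass. It does not implement the set unit mutual information or the HSF algorithm, which the abstract does not define; the class name and its methods are hypothetical.

```python
import math
from collections import Counter


class StreamingMutualInformation:
    """Single-pass estimate of I(X; Y) for a discrete feature X and label Y.

    Memory grows with the number of distinct (value, label) pairs,
    not with the number of samples seen.
    """

    def __init__(self):
        self.n = 0
        self.joint = Counter()    # counts of (feature value, label)
        self.feature = Counter()  # counts of feature value
        self.label = Counter()    # counts of label

    def update(self, x, y):
        """Fold one (feature value, label) sample into the counts."""
        self.n += 1
        self.joint[(x, y)] += 1
        self.feature[x] += 1
        self.label[y] += 1

    def value(self) -> float:
        """Return the current mutual information estimate in bits."""
        if self.n == 0:
            return 0.0
        mi = 0.0
        for (x, y), c in self.joint.items():
            # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
            ratio = c * self.n / (self.feature[x] * self.label[y])
            mi += (c / self.n) * math.log2(ratio)
        return mi
```

A feature whose estimate stays near zero carries little information about the class and is a candidate for removal; how HSF combines such scores over feature subsets and applies the Hoeffding inequality is specific to the thesis.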
Keywords/Search Tags:Intrusion detection, Hoeffding inequality, Data stream classification, Feature selection, Mutual information