
Research On Key Technologies Of Intrusion Detection And Alert Correlation Based On Machine Learning

Posted on: 2017-03-17 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: J Liu | Full Text: PDF
GTID: 1318330518994047 | Subject: Information security
Abstract/Summary:
Network technology has penetrated every facet of people's work and life, and the Internet has become an important infrastructure that carries massive amounts of information for human society. It has brought great convenience to people's lives; meanwhile, various malicious attacks accompany it and seriously threaten network security. Intrusion detection systems (IDS) and alert correlation analysis are both key technologies in the network security domain and play important roles in detecting attacks. An IDS discovers network attack behavior by collecting and analyzing related data streams in order to reduce threats. Alert correlation can expand the scope of intrusion detection and improve the quality of alerts through the fusion of multi-source information. As network scale expands and attack technology develops, the dimensionality and volume of data keep growing, which brings more challenges to current intrusion detection techniques and correlation analysis methods when processing massive data. This dissertation studies feature reduction, data stream classification, anomaly detection and correlation rule generation with machine learning techniques, aiming to improve the performance of IDS and the automation of alert correlation. The main research work and contributions are as follows:

1. To improve the real-time performance of IDS when processing high-dimensional data, a feature reduction method based on rough set theory and principal component analysis (PCA) is presented. Feature reduction reduces the number of features without degrading classification or representation capability, which improves the efficiency of data analysis.
The discernibility matrix and information entropy are used to perform feature selection, and a weighted kernel function is constructed to perform feature mapping and feature extraction. The two processes are carried out iteratively to obtain higher-level features that are more concise than the original ones.

2. Classification is often used in misuse detection. Traditionally, labeled data are used to train the classification model, but the dynamic nature of data streams and the high cost of labeling pose challenges for traditional classification methods. A data stream classification method based on decision feedback is presented to deal with this problem. First, an ensemble classification model is constructed from the labeled data chunks and used to make a rough decision on the unlabeled data. Then clustering models are trained on the unlabeled data chunks together with the rough decision results to provide constraints for the ensemble model, so that the supervised ensemble model is extended into a semi-supervised one that combines labeled and unlabeled data chunks. The extended model obtains more accurate results by maximizing the consensus of all models, and thus improves prediction performance by exploiting the useful information in the unlabeled data chunks.

3. The principle of anomaly detection is to detect deviations from an established model of normal behavior, but it is difficult to guarantee the purity and integrity of the training datasets, so the performance of the model may suffer. An enhanced one-class SVM model trained in a semi-supervised manner is presented to deal with this problem.
First, the proposed method trains an anomaly detection model based on the traditional one-class SVM in an unsupervised manner; then a few unlabeled instances are selected for labeling using active learning; next, the model is extended in a semi-supervised manner using these labeled data. The selection strategy and termination condition are revised to satisfy the requirements of both purity and integrity, so the performance of the model can be improved substantially at low labeling cost.

4. Alert correlation is one of the hot topics in the network security domain. It analyzes network security events according to predefined correlation rules to reveal the hidden relationships among these discrete events. Unfortunately, current research in the field focuses mainly on correlation methods and rule expressions, while the generation and updating of correlation rules are still done manually in traditional methods. A rule generation method based on neural networks and genetic programming is presented. First, neural network models are trained to classify large numbers of security events according to different attack scenarios, and rule items are extracted and training sets are established based on the classification results. Then correlation rules are produced and optimized with genetic programming to complete the generation and updating of the rules. These rules can be supplied to the correlation engine to improve the automation and adaptability of traditional methods.

In summary, network attacks are becoming increasingly complicated and diversified. To deal with the challenges of intrusion detection and alert correlation in massive-data environments, we conduct in-depth research on the key technologies in this area based on machine learning and propose effective solutions, covering feature extraction, dynamic data stream classification, anomaly detection and correlation analysis. Experimental results demonstrate the feasibility and effectiveness of the proposed methods.
The results of this research can improve the performance of intrusion detection systems and promote the automation and adaptability of traditional correlation analysis methods, thereby enabling more timely and accurate analysis of potential threats in massive data.
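
To make the entropy-based part of the feature selection in item 1 more concrete, the following is a minimal sketch only, not the method of the dissertation. It assumes discrete features and uses plain information gain as the scoring criterion; the abstract does not state the exact criterion, and the rough-set matrix and the weighted kernel mapping are not shown. The function names and the toy connection records are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    total = len(labels)
    # Weighted average of the label entropy inside each feature-value group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted by decreasing information gain."""
    gains = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        gains.append((name, information_gain(column, labels)))
    return sorted(gains, key=lambda item: item[1], reverse=True)


# Hypothetical toy connection records: (protocol, flag, service).
names = ["protocol", "flag", "service"]
rows = [
    ("tcp", "SF", "http"),
    ("tcp", "S0", "http"),
    ("udp", "SF", "dns"),
    ("tcp", "S0", "ftp"),
    ("udp", "SF", "dns"),
    ("tcp", "SF", "ftp"),
]
labels = ["normal", "attack", "normal", "attack", "normal", "normal"]

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.4f}")
```

On this toy data `flag` ranks first, because it separates the two classes perfectly and its gain therefore equals the full label entropy. A reduction step could keep only the top-ranked features before any further mapping or extraction.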
Keywords/Search Tags: intrusion detection, feature reduction, data stream classification, anomaly detection, alert correlation