
Research On Intrusion Detection Technology Based On Random Forest Algorithm

Posted on: 2018-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: J L Song
Full Text: PDF
GTID: 2428330623950758
Subject: Computer Science and Technology
Abstract/Summary:
The security situation in cyberspace is increasingly complicated, and the means of network intrusion are increasingly diverse. The growing number of complex network intrusion events has done great harm to the network ecosystem, so detecting network intrusions more accurately and rapidly has become a focus of current research. Benefiting from the rapid development of artificial intelligence in recent years, machine learning has also emerged in the field of network security. It discriminates abnormal network traffic samples intelligently by training a classification model on datasets collected in real environments, and it breaks the limitations of traditional detection methods based on feature knowledge. Applying machine learning algorithms to network intrusion detection is therefore a hot spot of current research. However, intrusion detection based on machine learning still has problems such as long response time, high false positive rate and poor scalability. To address these shortcomings, this research focuses on optimizing the random forest algorithm, the network traffic feature selection algorithm and unbalanced data classification technology, in order to reduce the time overhead and improve the accuracy of intrusion detection. The main contents of this paper include:

(1) A simple, easy-to-implement and low-cost hybrid feature selection algorithm. Filter and wrapper are two prevalent kinds of feature selection algorithm. The filter feature selection algorithm uses basic statistical characteristics of the data to evaluate the correlation between features independently of any classifier. Its time overhead is small. However, since the feature selection process is independent of the classification model, the selected features may be redundant or even unfavorable to classification. The wrapper feature selection algorithm is usually combined with a specific classification algorithm, so that it obtains an optimal feature subset while ensuring excellent classification performance. Because it is tied to a particular classifier, this kind of method often has large computational overhead and long training time. To improve accuracy and reduce the time cost relative to these methods, we propose a hybrid feature selection algorithm based on the chi-square test and the random forest algorithm.

(2) The intrusion detection model based on hybrid feature selection and the random forest algorithm. To address problems such as the variety of intrusion behaviors, the difficulty of selecting data features in high-dimensional networks and the high false alarm rate, we propose an intrusion detection model based on hybrid feature selection and the random forest algorithm. In the model, the hybrid feature selection algorithm is used to select the optimal feature subset. The model is then trained with the random forest algorithm to achieve high intrusion detection accuracy and reduce the computational time overhead.

(3) The intrusion detection model for unbalanced network data. In view of the unbalanced distribution of intrusion types in real network traffic data and the limited detection rate of current intrusion detection methods on minority intrusion types, this paper improves the SMOTE technique. The SMOTE algorithm combined with KNN is introduced into the proposed intrusion detection model based on hybrid feature selection and the random forest algorithm. This model improves the detection rate on unbalanced network data by oversampling the minority samples.

Experimental results show that, compared with commonly used feature selection algorithms, the hybrid feature selection algorithm designed in this paper is simple and easy to implement. The training time of the models on two different test sets is reduced by 29.48% and 15.76% respectively, which helps to achieve a more lightweight intrusion detection system. Furthermore, the proposed intrusion detection model based on hybrid feature selection and the random forest algorithm improves intrusion detection accuracy by 12.38% compared with commonly used machine learning models. Finally, by improving the SMOTE algorithm and introducing the unbalanced data processing method into the proposed intrusion detection model, the model greatly improves the fine-grained identification of specific intrusion categories and its generalization capability, which further helps to improve the versatility of the intrusion detection system.
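The abstract does not say how the chi-square test and the random forest are combined, so the following is only a minimal sketch of the filter stage, assuming discrete features. It scores each feature with the Pearson chi-square statistic against the class label and ranks the features. The function names and the toy data are hypothetical and do not come from the thesis.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Pearson chi-square statistic between one discrete feature and the class label."""
    n = len(labels)
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    observed = Counter(zip(feature_values, labels))
    score = 0.0
    for value, value_count in feature_counts.items():
        for label, label_count in label_counts.items():
            # Expected cell count if the feature and the label were independent.
            expected = value_count * label_count / n
            diff = observed[(value, label)] - expected
            score += diff * diff / expected
    return score


def rank_features(rows, labels):
    """Return feature indices sorted from highest to lowest chi-square score."""
    n_features = len(rows[0])
    scores = [
        chi_square_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data invented for illustration: protocol, connection flag, and a noise feature.
rows = [
    ("tcp", "S0", "a"),
    ("tcp", "S0", "b"),
    ("udp", "SF", "a"),
    ("udp", "SF", "b"),
]
labels = ["attack", "attack", "normal", "normal"]
ranking = rank_features(rows, labels)
```

In a hybrid scheme of the kind described, the top-ranked features from a filter like this would presumably be passed to a classifier-dependent stage (here, the random forest) for final selection. That second stage is not shown because the abstract gives no details of it.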
Keywords/Search Tags: Intrusion Detection, Random Forest, Feature Selection, Unbalanced Data
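For readers unfamiliar with the oversampling step mentioned in contribution (3), the sketch below shows the standard SMOTE idea: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority-class neighbors. This is not the improved variant proposed in the thesis, whose details the abstract does not give. The function name, parameters and data are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples by interpolating between minority-class neighbors.

    Assumes `minority` holds at least two numeric tuples of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority-class neighbors of the base sample, excluding itself.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, neighbor))
        )
    return synthetic


# Toy minority class invented for illustration: four points on the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote(minority, n_new=6)
```

Because every synthetic point lies between two existing minority samples, the new samples stay inside the region the minority class already occupies. The oversampled set would then be used together with the majority-class data to train the classifier.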