Research On Key Technologies Of Network Intrusion Detection Based On Spark Platform

Posted on: 2021-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z F Pei
GTID: 2428330614458361
Subject: Electronic and communication engineering
Abstract/Summary:
Intrusion detection is a dynamic network security method that can effectively protect computer systems and networks from intrusion, and it complements static security methods to form a line of defense for network security. In the past few years, network intrusion detection based on data mining has gradually become a research hotspot. However, several problems remain: redundant features in the data degrade detection performance, the algorithms have difficulty handling abnormal data, and large-scale data is processed slowly. This thesis therefore adopts a feature selection method and combines it with the distributed computing platform Spark to study network intrusion detection in depth. The specific work is as follows:

1. Network intrusion detection data contains redundant and noisy features that degrade detection performance, and its dimensionality is so high that classifier training and detection take too long. A hybrid feature selection method based on an adaptive genetic algorithm is therefore proposed. First, the chi-square filtering algorithm is used to remove redundant features and features with low correlation. Second, a LightGBM classifier is combined with an adaptive genetic algorithm to form a hybrid feature selection method that searches for a feature subset with good classification results. Experimental results show that the method reduces features more effectively than the filter and wrapper methods, and that the selected feature subset yields a higher detection rate and a lower false positive rate on different classifiers.

2. The K-means algorithm clusters poorly when the network intrusion detection data is linearly inseparable and the cluster distribution is non-elliptical, so a parallel K-means optimization algorithm based on the Gaussian kernel function is proposed. First, the Gaussian kernel function maps the network intrusion detection data to a high-dimensional feature space to increase the difference between samples of different classes. Second, the data is clustered with the optimized kernel K-means algorithm, which can still obtain a correct clustering when the K-means algorithm fails. Finally, to process large-scale network intrusion detection data, the optimized kernel K-means algorithm is implemented in parallel on Spark. Experimental results show that the algorithm detects intrusions better than the K-means algorithm and achieves better speedup and scaleup when processing large-scale data.
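
The chi-square filtering step described in item 1 can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the dictionary-of-columns layout, and the choice to keep a fixed number of top-ranked features are all assumptions made here, and the genetic-algorithm and LightGBM wrapper stage is not shown. Features are assumed to be categorical (or already discretized).

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the class label."""
    n = len(labels)
    feature_counts = Counter(feature)
    class_counts = Counter(labels)
    joint_counts = Counter(zip(feature, labels))
    score = 0.0
    for f_value, n_f in feature_counts.items():
        for c_value, n_c in class_counts.items():
            # Count expected in this cell if feature and class were independent.
            expected = n_f * n_c / n
            observed = joint_counts.get((f_value, c_value), 0)
            score += (observed - expected) ** 2 / expected
    return score


def select_top_features(columns, labels, keep):
    """Rank features by chi-square score and keep the `keep` highest (hypothetical helper)."""
    scores = {name: chi_square_score(col, labels) for name, col in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores
```

A higher score means the feature is more strongly associated with the class label, so a filter of this kind would discard the lowest-scoring features before the wrapper search runs.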
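
The kernel K-means idea in item 2 can be sketched on a single machine as follows. This is a plain textbook kernel K-means with a Gaussian kernel, not the optimized or Spark-parallel variant from the thesis, whose details the abstract does not give. The function names, the `sigma` bandwidth parameter, and the random label initialization are assumptions for illustration only.

```python
import math
import random


def gaussian_kernel_matrix(points, sigma):
    """K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(points)
    gamma = 1.0 / (2.0 * sigma ** 2)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = K[j][i] = math.exp(-gamma * d2)
    return K


def kernel_kmeans(K, k, max_iter=50, seed=0):
    """Cluster using only kernel values; returns one label per sample."""
    n = len(K)
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in range(n)]  # assumed: random initial labels
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Squared norm of each cluster mean in feature space.
        within = [
            sum(K[a][b] for a in m for b in m) / (len(m) ** 2) if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], math.inf
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                # ||phi(x_i) - mean_c||^2 expanded in terms of kernel values.
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels
```

Because distances are computed entirely from the kernel matrix, the feature-space mapping is never built explicitly. The result depends on the initial labels and on `sigma`, so in practice several seeds and bandwidths would likely need to be tried.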
Keywords/Search Tags: intrusion detection, Spark, feature selection, kernel clustering, parallelization