Application Of Improved Grid Density Peaks Clustering Algorithm In Intrusion Detection

Posted on: 2020-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: P P Jiang
GTID: 2428330578455256
Subject: Computer Science and Technology
Abstract/Summary:
The Density Peaks Clustering (DPC) algorithm has few input parameters and can identify clusters of arbitrary shape. However, as data volumes grow and data forms diversify, its shortcomings become apparent: the clustering centers must be selected manually, and an inaccurate selection degrades the clustering result; on large data sets, computing the distance matrix costs a great deal of time and memory; and the algorithm can mistakenly split a single cluster into several density peak clusters.

To solve these problems, this paper proposes an improved density peak clustering algorithm based on grid partitioning, called the G-DPC algorithm. It combines a grid clustering algorithm with the DPC algorithm and adaptively selects core grid representative points as clustering centers. Data points are assigned to clusters grid by grid, noise points are screened out, and clusters that meet the merging conditions are merged. On large-scale data sets, the advantages of grid clustering help avoid memory overflow.

To verify the clustering effect of the G-DPC algorithm, tests were carried out on low-dimensional standard data sets and on high-dimensional data sets, and the results confirm the effectiveness of the algorithm. Finally, the G-DPC algorithm is applied to intrusion detection. Tests on the KDD CUP99 dataset show that the overall performance of the G-DPC algorithm improves on that of the DPC algorithm.

The main research results of this paper are as follows:

(1) An improved density peak clustering algorithm based on grid partitioning is proposed. The algorithm consists of five parts: grid division, data clustering, adaptive selection of clustering centers, cluster merging, and noise point processing. It identifies arbitrary clusters with few input parameters and can process large-scale data sets efficiently.

(2) A method for automatically selecting the core grid representative points is presented. Using an adaptive center-selection formula, representative points that satisfy the two conditions for a clustering center are selected automatically, which addresses the large errors that can arise when clustering centers are selected manually.

(3) A new noise point removal criterion is defined. Noise points are screened out from the set of potential noise points, which makes the selection of noise points more refined.

(4) Cluster merging is adopted: clusters that meet the merging conditions are merged, which avoids the situation in which a single class contains multiple density peaks.
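The general idea of combining grid partitioning with density peaks can be illustrated with a minimal sketch. It is not the thesis's G-DPC algorithm: the abstract does not give the adaptive center-selection formula, the noise criterion, or the merging conditions, so none of them appear here. The sketch assumes that each non-empty grid cell is represented by its centroid, that the cell's point count serves as its density, and that the standard DPC score gamma = rho * delta ranks candidate centers. The function names and the `cell_size` parameter are hypothetical.

```python
import math
from collections import defaultdict


def grid_cells(points, cell_size):
    """Group points by the grid cell they fall into (assumed uniform grid)."""
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(x / cell_size) for x in p)
        cells[key].append(p)
    return cells


def cell_representatives(cells):
    """Return (centroid, point count) per cell; the count is used as density rho."""
    reps = []
    for members in cells.values():
        dim = len(members[0])
        centroid = tuple(
            sum(p[i] for p in members) / len(members) for i in range(dim)
        )
        reps.append((centroid, len(members)))
    return reps


def dpc_scores(reps):
    """Compute (centroid, rho, delta, gamma) for each cell representative.

    delta is the distance to the nearest representative of higher density
    (ties broken by index); the densest representative instead gets the
    largest distance to any representative, as in standard DPC.
    """
    scores = []
    for i, (ci, rho_i) in enumerate(reps):
        higher = [
            math.dist(ci, cj)
            for j, (cj, rho_j) in enumerate(reps)
            if rho_j > rho_i or (rho_j == rho_i and j < i)
        ]
        if higher:
            delta = min(higher)
        else:
            delta = max((math.dist(ci, cj) for cj, _ in reps), default=0.0)
        scores.append((ci, rho_i, delta, rho_i * delta))
    return scores
```

Because distances are computed only between cell representatives, the distance matrix grows with the number of non-empty cells rather than with the number of points, which is the memory saving the abstract attributes to the grid step. Sorting the output of `dpc_scores` by gamma in descending order gives a ranked list of candidate centers; how many to keep is exactly what the thesis's adaptive formula decides, and this sketch leaves that open.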
Keywords/Search Tags: data mining, clustering, intrusion detection, peak density