
Research On Density Peaks Clustering Based On The Grid

Posted on: 2018-03-13  Degree: Master  Type: Thesis
Country: China  Candidate: F Wang  Full Text: PDF
GTID: 2348330569986432  Subject: Computer Science and Technology
Abstract/Summary:
Clustering analysis is an important method in data mining. It accounts for a large proportion of unsupervised learning methods and has been widely used, with good results, in various fields of science. Advances in information technology have enriched the ways people acquire information, and massive information resources can now be obtained from the Internet. At the same time, network access behavior produces a large amount of network traffic data, which contains many abnormal records caused by network anomalies, malicious operations, or hacker intrusions. How to detect abnormal data in massive data sets has therefore become an important issue. This study involves two clustering-related tasks: the first is to build a clustering model for data preprocessing, used for data reduction and for exploring the inherent distribution of network traffic data; the second is to build an anomaly detection model based on unsupervised learning.

The density peaks clustering algorithm, proposed in Science in 2014, is a typical density-based clustering algorithm. It has attracted the attention of many researchers because it is novel, achieves high accuracy, and can find cluster centers quickly. This thesis adopts the idea of density peaks clustering to analyze network traffic data in a big data environment. Exploiting the algorithm's ability to find cluster centers quickly and to cluster effectively, it proposes a clustering model that combines grid-based clustering with density peaks clustering. The model is based on multi-granularity grid partitioning and has two parts, a coarse-grained grid and a fine-grained grid. Its contributions are as follows:

1. Coarse-grained grid partitioning. A density peaks clustering algorithm based on a coarse grid is proposed. First, the data set is partitioned into grid cells at coarse granularity. Then the data in each grid cell is clustered separately by the density peaks clustering algorithm. Finally, the overall clustering result is obtained by merging the results from each grid cell. Simulation results show that this algorithm can process large-scale data effectively and at high speed.

2. Fine-grained grid partitioning. A density peaks clustering algorithm based on a fine grid is also proposed, because the coarse-grid algorithm loses some information about the global data distribution. First, the grid cells are generated by a fine-grained partition. Then the central cells are identified using the density peaks idea. Finally, the grid cells that are similar to a central cell are merged with it to obtain the clustering result. Simulation results show that the proposed algorithm can quickly find the global cluster centers and that it compensates for the shortcoming of the coarse-grid algorithm.

Additionally, the model is applied to a web attack detection system based on big data. The system runs stably and works well, which shows that the algorithms proposed in this thesis are practicable.
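The fine-grained variant described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function name `grid_density_peaks` and the parameters `cell_size` and `n_clusters` are hypothetical, cell density is assumed to be a simple point count, and the abstract's "merge cells similar to a central cell" step is replaced here by the standard density peaks rule of giving each cell the label of its nearest higher-density cell.

```python
from collections import Counter
from math import dist


def grid_density_peaks(points, cell_size, n_clusters):
    """Label points by running density peaks on grid cells (illustrative only)."""
    # 1. Fine-grained partition: map every point to an integer cell index.
    cell_of = [tuple(int(x // cell_size) for x in p) for p in points]
    density = Counter(cell_of)  # rho: point count per non-empty cell (assumption)

    # 2. Rank cells by decreasing density; ties broken by index for determinism.
    cells = sorted(density, key=lambda c: (-density[c], c))

    # 3. delta: distance (in cell units) to the nearest higher-ranked cell.
    delta, parent = {}, {}
    for i, c in enumerate(cells[1:], start=1):
        nearest = min(cells[:i], key=lambda h: dist(c, h))
        delta[c], parent[c] = dist(c, nearest), nearest
    # The densest cell has no higher-ranked neighbour; by convention its
    # delta is the largest distance to any other cell.
    top = cells[0]
    delta[top] = max(dist(top, c) for c in cells)

    # 4. Central cells: the n_clusters cells with the largest rho * delta.
    centers = sorted(cells, key=lambda c: -density[c] * delta[c])[:n_clusters]
    label = {c: k for k, c in enumerate(centers)}

    # 5. Remaining cells inherit the label of their nearest higher-ranked
    #    cell; processing in density order guarantees that cell is labelled.
    for c in cells:
        if c not in label:
            label[c] = label[parent[c]]

    return [label[c] for c in cell_of]
```

Because the expensive distance computations run over non-empty cells rather than individual points, the cost depends on the number of occupied cells, which is the motivation for the grid step. The pairwise search in step 3 is still quadratic in the number of cells, so a real system would likely restrict it to neighbouring cells.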
Keywords/Search Tags: clustering, density peak, grid partition, web attack detection