Large-scale Network Anomaly Detection Based On Data Mining

Posted on: 2019-05-17; Degree: Master; Type: Thesis
Country: China; Candidate: H M Hu; Full Text: PDF
GTID: 2428330572955907; Subject: Communication and Information System
Abstract/Summary:
With the rapid development of modern computer networks and information technology, the Internet has made global information sharing possible. While users enjoy the convenience of networked systems, those systems suffer more attacks, and the privacy of users' personal information is threatened. Network security has become an urgent problem. A network intrusion detection system is an important measure for protecting a network from attack: it safeguards the security, reliability, and integrity of data while keeping the network fast and efficient. At the same time, the development of information technology has made network data high-dimensional and highly complex. Data mining can process massive amounts of data quickly and effectively, so many researchers have proposed and studied applying data mining to network intrusion detection in order to improve detection efficiency.

Building on traditional anomaly detection, this thesis proposes a statistics-based cluster pattern recognition method that identifies abnormal network behavior by analyzing the distribution of the distances from the objects in each cluster to the cluster center. Current network intrusion detection mostly finds abnormal behavior by detecting outliers through cluster analysis. Traditional detection assumes that abnormal behavior consists of a small number of scattered records that differ markedly from normal data, so it does not recognize anomalies that form aggregated clusters. In a real network, the size and diversity of normal and abnormal behavior cannot be predicted, which makes existing anomaly detection systems inefficient or even ineffective, especially when an attacker sends a large volume of intrusion data disguised to closely resemble normal data. The proposed method removes the traditional assumption that abnormal data are isolated points. To improve clustering accuracy, the thesis also uses the entropy method to assign a weight to each attribute (dimension) of the data objects, which optimizes the similarity measure used in the nearest neighbor clustering algorithm.

The LOF algorithm performs poorly on large volumes of high-dimensional data because its time and space complexity are very high. To address this defect, the thesis proposes a k-d tree based LOF algorithm that can process massive data efficiently. Outlier mining is used to find individuals in a data set that differ from normal behavior. The data space is divided into a series of k-dimensional hyper-rectangular regions; the data objects are stored in a k-d tree, which forms a spatial partition tree, and all outlier detection is performed on that structure, which supports fast data retrieval. It is generally believed that a detected outlier differs from normal data objects because it is produced by a different mechanism, not because of random factors. From the perspective of knowledge discovery, in some applications rare events deserve more attention than ordinary ones.

Both improved methods are simulated in Matlab. The results show that the statistics-based cluster pattern recognition method has good detection ability and adaptability, and that the k-d tree based LOF outlier detection algorithm clearly outperforms the unmodified LOF algorithm in time complexity and number of calculations.
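The entropy-based attribute weighting mentioned in the abstract can be illustrated with a minimal Python sketch. It follows the standard entropy weight method (min-max scaling, per-attribute Shannon entropy, weights proportional to one minus entropy) and plugs the weights into a weighted Euclidean distance. The abstract does not give the thesis's exact formulas, and the thesis itself used Matlab, so the function names `entropy_weights` and `weighted_distance` and the choice of Euclidean distance are assumptions made for illustration only.

```python
import math


def entropy_weights(rows):
    """Entropy weight method: one weight per attribute, weights sum to 1.

    Assumes at least two rows; `rows` is a list of equal-length numeric lists.
    """
    n = len(rows)
    dims = len(rows[0])
    divergences = []
    for j in range(dims):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        span = hi - lo
        # Min-max scale to [0, 1]; a constant column is mapped to all ones.
        scaled = [(v - lo) / span if span > 0 else 1.0 for v in col]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        # Shannon entropy normalised by ln(n), treating 0 * ln(0) as 0.
        e_j = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Low entropy means the attribute discriminates more between objects.
        divergences.append(max(0.0, 1.0 - e_j))
    d_sum = sum(divergences)
    if d_sum == 0:
        return [1.0 / dims] * dims  # no attribute is informative: equal weights
    return [d / d_sum for d in divergences]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two data objects (an assumed metric)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


if __name__ == "__main__":
    data = [[1.0, 5.0], [2.0, 5.0], [9.0, 5.0]]
    w = entropy_weights(data)
    print(w, weighted_distance(data[0], data[2], w))
```

In this toy data the second attribute is constant, so it receives a weight close to zero and the distance is driven almost entirely by the first attribute.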
Keywords/Search Tags: Anomaly detection, Clustering model recognition, Isolated point detection, k-d tree, LOF algorithm
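The k-d tree based LOF idea named in the keywords and described in the abstract can be sketched as follows. This is a minimal pure-Python illustration, not the thesis's Matlab implementation: the tree layout, the function names (`build_kdtree`, `knn`, `lof_scores`), and the simplification of using exactly k neighbours (ties at the k-th distance are not expanded, and duplicate points are not handled specially) are all assumptions.

```python
import heapq
import math


def build_kdtree(points, indices=None, depth=0):
    """Recursively split the point indices at the median, cycling through axes."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return {
        "index": indices[mid],
        "axis": axis,
        "left": build_kdtree(points, indices[:mid], depth + 1),
        "right": build_kdtree(points, indices[mid + 1:], depth + 1),
    }


def knn(node, points, query_idx, k, heap=None):
    """Collect the k nearest neighbours of points[query_idx] in a max-heap of (-dist, idx)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    q = points[query_idx]
    i, axis = node["index"], node["axis"]
    if i != query_idx:  # a point is not its own neighbour
        d = math.dist(q, points[i])
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    diff = q[axis] - points[i][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    knn(near, points, query_idx, k, heap)
    # Search the far side only if the splitting plane is closer than the current k-th neighbour.
    if len(heap) < k or abs(diff) < -heap[0][0]:
        knn(far, points, query_idx, k, heap)
    return heap


def lof_scores(points, k):
    """Local outlier factor of every point, with neighbours found through the k-d tree."""
    tree = build_kdtree(points)
    neighbours = []
    for i in range(len(points)):
        heap = knn(tree, points, i, k)
        neighbours.append(sorted((-neg_d, j) for neg_d, j in heap))
    k_dist = [nbrs[-1][0] for nbrs in neighbours]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for nbrs in neighbours:
        reach = sum(max(k_dist[j], d) for d, j in nbrs) / len(nbrs)
        lrd.append(1.0 / reach if reach > 0 else math.inf)
    return [
        sum(lrd[j] for _, j in nbrs) / (len(nbrs) * lrd[i])
        for i, nbrs in enumerate(neighbours)
    ]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
    print(lof_scores(pts, k=2))
```

Scores near 1 indicate points whose local density matches that of their neighbours; scores well above 1 indicate outliers. The tree lets each neighbour query skip whole hyper-rectangular regions instead of comparing against every point, which is the source of the speed-up the abstract reports over unmodified LOF.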