
Research On Key Techniques Of Anomaly Detection For Big Data Platform

Posted on: 2018-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Qiu
Full Text: PDF
GTID: 2428330542488088
Subject: Computer application technology
Abstract/Summary:
With the rapid development of big data technologies, the architecture of big data platforms is becoming more complex. At the same time, the security requirements of big data platforms in the face of new risks continue to grow. At present, the security of a big data platform is mainly ensured by its infrastructure. The underlying security mechanism, however, lacks semantic interpretation of the anomaly events of big data application platforms: the underlying cloud security mechanism cannot fully detect and analyze abnormal events in the upper-layer data processing platform, and firewalls and other traditional security devices cannot effectively detect and prevent them.

Anomaly detection technologies are often used to address the security issues of big data platforms. However, traditional big data platforms tend to adopt a single anomaly detection method, which can hardly meet the detection requirements of all the scenarios a big data platform faces. Anomaly detection is a difficult problem for which no mature solution exists yet, so studying the key techniques of anomaly detection for big data platforms is of great significance.

Given these issues, and in order to ensure the security of big data platforms, this thesis studies the advantages and disadvantages of various anomaly detection technologies and designs an anomaly detection system framework for big data platforms using big data technologies. The system implements anomaly detection by integrating a variety of anomaly detection algorithms, and it improves the anomaly detection methods based on statistical analysis and on data mining respectively. The main work is as follows:

(1) An anomaly detection system framework oriented to big data platforms is proposed, integrating distributed big data software architectures such as Hadoop and Spark. The system contains multiple anomaly detection algorithms. By collecting and analyzing server log data from the big data platform, the system can detect network anomalies. The design of this system provides a reference framework for the industry.

(2) For online anomaly detection, an anomaly detection algorithm based on the maximal information coefficient is proposed. The main idea of the algorithm is to model user activity from log data and to use the maximal information coefficient as the similarity measure. The algorithm is implemented with Spark Streaming. Experiments show that the algorithm maintains near-second-level processing speed and accurately detects network anomalies in the current window interval.

(3) For anomalies in massive historical data, an anomaly detection algorithm based on improved clustering is proposed. By processing the dataset with the improved clustering algorithm and computing an anomaly index for each cluster, anomalies can be mined from massive historical data. Experiments show that the algorithm can detect anomalies in very large volumes of data and, given a reasonable threshold, achieves high precision.
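The idea in item (2), scoring a window of user activity by its maximal information coefficient (MIC) against a reference profile, can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the abstract does not give the activity model, the grid search, or the threshold, so everything below is an assumption made for illustration. In particular, `approx_mic` uses fixed equal-frequency bins instead of the optimized grid partitions of true MIC, the names `approx_mic` and `flag_window` are hypothetical, and the `alpha=0.6` grid budget and `threshold=0.5` are placeholder values. The Spark Streaming integration is not shown.

```python
import math
from collections import Counter


def equal_frequency_bins(values, k):
    """Map each value to one of k rank-based bins; tied values share a bin."""
    n = len(values)
    counts = Counter(values)
    first_rank = {}
    seen = 0
    for v in sorted(counts):
        first_rank[v] = seen  # number of samples strictly smaller than v
        seen += counts[v]
    return [first_rank[v] * k // n for v in values]


def mutual_information(bx, by):
    """Mutual information (in nats) of two equally long bin-label sequences."""
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def approx_mic(x, y, alpha=0.6):
    """Simplified MIC: best normalized MI over a-by-b grids with a*b <= n**alpha.

    Assumption: equal-frequency bins stand in for MIC's optimized partitions,
    so this is a lower-bound style approximation, not the exact statistic.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and equally long")
    budget = max(len(x) ** alpha, 4)  # always try at least the 2x2 grid
    best = 0.0
    a = 2
    while a * 2 <= budget:
        bx = equal_frequency_bins(x, a)
        b = 2
        while a * b <= budget:
            mi = mutual_information(bx, equal_frequency_bins(y, b))
            best = max(best, mi / math.log(min(a, b)))
            b += 1
        a += 1
    return best


def flag_window(baseline, window, threshold=0.5):
    """Flag a window whose similarity to the baseline falls below threshold.

    The threshold value is a placeholder, not taken from the thesis.
    """
    score = approx_mic(baseline, window)
    return score < threshold, score
```

Here `baseline` and `window` would be equally long activity series, for example per-interval request counts extracted from server logs. A window that tracks the baseline closely scores near 1, while a window unrelated to it scores near 0 and is flagged.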
Keywords/Search Tags:Anomaly Detection, Big Data Platform, Statistical Analysis, Data Mining, Cluster Analysis