Research And Implementation Of Network Data Analysis System Based On Spark

Posted on: 2020-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: W H Zheng
Full Text: PDF
GTID: 2428330590983187
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, network data has grown in both volume and diversity. At the same time, new kinds of attacks targeting this data have made network security problems thornier. How to take full advantage of big data to analyze and detect anomalous network data has therefore become a significant issue. In recent years, with the development of artificial intelligence technology, deep learning has been shown to perform better in the analysis of large amounts of data. However, deep learning consumes substantial computing resources, a cost that can be effectively addressed by combining deep learning capabilities with big data processing capabilities. To address this problem, a network data analysis system based on Spark is designed. First, the deep learning framework Keras is combined with the big data processing platform Spark, extending Spark with deep learning capability and enabling distributed deep learning computation, so that the acquisition and processing of big data and the training and deployment of models are brought into a single unified distributed cluster. Second, building on the characteristics of Spark Streaming, the system monitors network data at runtime and judges and responds to anomalous data in a timely manner. In addition, the relevant parameters can be adjusted to analyze and predict the performance of Spark Streaming. Finally, a dynamic adjustment strategy for the time interval is designed to balance the latency and throughput of the real-time system and to improve the computing performance of Spark Streaming. Compared with a traditional distributed Keras deep learning cluster, the Spark-based network data analysis system reduces the time overhead of transferring large amounts of data between two independent clusters for deep learning and data processing. At the same time, applications written for Keras can be easily migrated to this platform, which reduces system complexity. For Spark Streaming, the system adapts better to changes in external conditions, maintains system stability, and improves computing performance.
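The abstract does not say how the dynamic time-interval adjustment works. As a purely illustrative sketch, and not the thesis's actual strategy, the following assumes that the "time interval" is the Spark Streaming batch interval and that the controller tries to keep the ratio of batch processing time to batch interval near a target value. The class name `IntervalController` and every parameter and default value are hypothetical. The sketch is plain Python and deliberately leaves out how a new interval would be applied to a running streaming job.

```python
from dataclasses import dataclass


@dataclass
class IntervalController:
    """Hypothetical feedback controller for a streaming batch interval.

    All names and default values are illustrative assumptions,
    not taken from the thesis.
    """
    min_interval: float = 0.5   # seconds; lower bound keeps scheduling overhead in check
    max_interval: float = 10.0  # seconds; upper bound caps end-to-end latency
    target_ratio: float = 0.8   # desired processing_time / batch_interval
    tolerance: float = 0.1      # dead band around the target to avoid oscillation
    step: float = 0.25          # fraction of the gap closed per adjustment

    def next_interval(self, current_interval: float, processing_time: float) -> float:
        """Return the interval to use for upcoming batches."""
        ratio = processing_time / current_interval
        # Inside the dead band: the system is considered stable, keep the interval.
        if abs(ratio - self.target_ratio) <= self.tolerance:
            return current_interval
        # Interval at which the observed processing time would hit the target ratio.
        proposed = processing_time / self.target_ratio
        # Move only part of the way to damp reactions to noisy measurements.
        adjusted = current_interval + self.step * (proposed - current_interval)
        # Clamp to the configured bounds.
        return max(self.min_interval, min(self.max_interval, adjusted))


controller = IntervalController()
# Batches take longer than the interval: the interval grows (favoring throughput).
grown = controller.next_interval(current_interval=2.0, processing_time=3.0)
# Batches finish far ahead of the interval: the interval shrinks (favoring latency).
shrunk = controller.next_interval(current_interval=2.0, processing_time=0.4)
print(grown, shrunk)
```

The dead band and the partial step are design choices made for this sketch: they trade reaction speed for stability. In practice processing time also depends on how much data arrives per interval, which this simplified model ignores.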
Keywords/Search Tags: Abnormal detection, Deep learning, Big data, Distributed computing, Stream processing