Design And Implementation Of Log Stream Analysis Of Computer Room Security Equipment Based On Spark On Yarn

Posted on: 2019-04-19
Degree: Master
Type: Thesis
Country: China
Candidate: J M Yang
GTID: 2438330572953694
Subject: Software engineering
Abstract/Summary:
As big data is widely adopted, many organizations are placing higher demands on the security and stability of their computer systems, including data systems, and especially on the basic environment of the IDC computer room that supports the safe and stable operation of applications and business systems. To keep equipment running normally and the network stable, intelligent operation and maintenance (O&M) methods based on big data technology have emerged in this context. Among them, collecting equipment operation logs and performing deeper statistical analysis on them has become an urgently needed method for computer room maintenance staff: it lets them discover system vulnerabilities, network attacks, and similar behavior in time and take targeted measures. Predicting problems in advance has likewise become a leading direction for many operations teams.

Traditional offline log analysis struggles to meet the need for real-time, accurate warnings. Distributed real-time processing frameworks built on big data technology, such as Spark Streaming, have therefore drawn attention because they further improve the timeliness of log analysis. The early-warning function for computer room O&M, which underpins stable service, has especially strict real-time requirements, because network and system security problems have an unusually large impact on business systems and their data. The computer room log analysis system presented in this thesis is built on Spark Streaming and combines the logs collected in the computer room with distributed stream computing to produce early-warning information.

The system uses Flume to collect log data from servers and network security devices, Kafka (a publish-subscribe messaging system) to buffer the messages, and Spark Streaming to analyze the log data. On the Hadoop cluster, Spark runs in Spark on YARN mode, with YARN scheduling Spark to improve its running performance. Because Spark computes in memory, the system needs hardware capable enough to host a Hadoop distributed cluster. This requirement has to be addressed first, and Spark on YARN is then deployed on top of that cluster. Once the project is complete, the health of the data center's servers and network security equipment can be monitored, and O&M staff can obtain early-warning results through real-time queries. This reduces network and equipment security risks in the data center computer room and helps the devices and information systems running there operate safely and stably.

Kafka and Flume are both fairly stable systems, and log collection and transmission built on the two is reliable. In some special cases Flume alone may lose data in transit, but Kafka helps prevent this: it can run as a cluster and, when a node fails, its recovery mechanism quickly preserves the data. Together the two protect the integrity and consistency of the log data. A common design pattern for combining the two frameworks is a Flume+Kafka distributed data flow model in which Flume acts as the message producer, and this project follows that pattern for log collection and transmission.
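
The collection, buffering, and analysis stages described in the abstract can be illustrated with a small plain-Python sketch. It is not Flume, Kafka, or Spark code and is not the thesis's implementation: the log line format, the `Topic` class standing in for a Kafka topic, the 10-second batch interval, and the rule "warn when a device logs at least 3 ERROR or CRITICAL lines in one batch" are all hypothetical choices made only to show how micro-batch analysis of a log stream might produce early warnings.

```python
import re
from collections import Counter, defaultdict, deque
from dataclasses import dataclass

# Hypothetical log format: "<epoch_seconds> <device> <level> <message>"
LINE_RE = re.compile(r"^(\d+)\s+(\S+)\s+(\S+)\s+(.*)$")


@dataclass(frozen=True)
class LogEvent:
    ts: int
    device: str
    level: str
    message: str


def parse(line):
    """Parse one raw log line; return None for lines that do not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts, device, level, message = m.groups()
    return LogEvent(int(ts), device, level.upper(), message)


class Topic:
    """In-memory stand-in for a message buffer (the role Kafka plays)."""

    def __init__(self):
        self._queue = deque()

    def produce(self, line):
        self._queue.append(line)

    def drain(self):
        items = list(self._queue)
        self._queue.clear()
        return items


def micro_batches(events, batch_seconds):
    """Group events into fixed intervals keyed by the interval start time."""
    batches = defaultdict(list)
    for ev in events:
        batches[ev.ts - ev.ts % batch_seconds].append(ev)
    return dict(sorted(batches.items()))


def warnings_for(batch, threshold):
    """Return devices with at least `threshold` severe lines in this batch."""
    counts = Counter(
        ev.device for ev in batch if ev.level in ("ERROR", "CRITICAL")
    )
    return sorted(device for device, n in counts.items() if n >= threshold)


def run_pipeline(lines, batch_seconds=10, threshold=3):
    topic = Topic()
    for line in lines:  # collection stage (the role Flume plays)
        topic.produce(line)
    # Analysis stage (the role Spark Streaming plays): parse, batch, aggregate.
    events = [ev for ev in map(parse, topic.drain()) if ev is not None]
    batches = micro_batches(events, batch_seconds)
    return {start: warnings_for(b, threshold) for start, b in batches.items()}


sample = [
    "100 fw01 ERROR connection denied",
    "101 fw01 ERROR connection denied",
    "103 fw01 ERROR connection denied",
    "104 srv02 INFO heartbeat ok",
    "111 fw01 ERROR connection denied",
    "not a log line",
]
print(run_pipeline(sample))  # {100: ['fw01'], 110: []}
```

In this sketch the unparseable line is dropped at the parse step, and only the first 10-second batch crosses the warning threshold. In a real deployment the buffer would be a Kafka topic fed by Flume and the batching would be handled by the streaming framework rather than by hand.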
Keywords/Search Tags: Spark Streaming, Log analysis, Kafka, Hadoop, Distributed cluster