Design And Implementation Of Security Log Analysis Tool Based On ELK

Posted on: 2021-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: K R Wang
Full Text: PDF
GTID: 2428330632462700
Subject: Computer technology
Abstract/Summary:
In recent years, the rapid development of Internet technology has brought tremendous convenience to production and daily life, but it has also introduced greater security problems. Network security defense technologies continue to emerge, and log analysis has become an important part of an active security defense system. It works mainly by analyzing and correlating log data generated by operating systems, applications, devices, and security products in order to uncover advanced, complex attacks. There are many well-known log analysis tools in the industry, such as Loggly, Sumo Logic, and Splunk, but according to this thesis they suffer from limited functionality, low performance, and the lack of a visual interface. ELK is an open-source log analysis stack that efficiently collects, stores, analyzes, and visualizes logs, and it has received great attention from researchers. However, ELK still has considerable room for improvement in log preprocessing, data skew avoidance, and log analysis capability. This thesis therefore draws on Spark's strengths in big data processing and focuses on the performance impact of data skew. It designs a log analysis framework that combines ELK and Spark, and proposes a system performance optimization algorithm based on suffix key-value discrete repartitioning. The main work is as follows:

(1) A system performance optimization algorithm based on suffix key-value discrete repartitioning is proposed. The algorithm first builds a key frequency distribution map from data volume information and defines a data skew rate using a data model based on variance and mean absolute deviation. It then appends a random suffix to keys whose data volume is imbalanced. Next, it uses the discrete repartitioning algorithm to scatter concentrated computing tasks, so that computing tasks end up evenly distributed. Experiments verify that this scheme partitions tasks evenly and improves job execution performance in the log analysis system.

(2) A data preprocessing method based on mixed-threshold session identification is proposed. Preprocessing consists of three steps: data cleaning, user identification, and session identification. The log data is first checked for completeness; users are then identified by IP address, ID, and network topology; finally, each user's sessions are identified with a mixed-threshold session identification algorithm, which derives a sliding session-duration threshold from the session duration and access-interval thresholds and uses it to decide whether the user has started a new session.

(3) An ELK-based security log analysis system is designed and developed. The thesis first analyzes the overall functional and non-functional requirements of the system, then plans and designs the architecture, and finally designs and implements each module on top of the ELK technology stack, the Spark big data processing framework, and the RabbitMQ message queue.

(4) The functions of the ELK-based security log analysis system are tested and verified. Experimental results show that the system correctly collects, analyzes, and stores logs, and provides log query and display functions. When the system detects an abnormal log, it sends an alert to operations staff and provides a visual view of the logs so that the cause can be located quickly. After optimization, indexing latency is within 10 milliseconds, CPU usage is below 50%, and execution efficiency improves by nearly 50%, which gives the work some reference value for practical applications.
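The abstract does not give the algorithms themselves, so the following is only a minimal, plain-Python sketch of the two ideas it names: salting over-represented keys with a random suffix before repartitioning, and splitting a user's requests into sessions with two thresholds. Everything specific here is an assumption for illustration and not the thesis's method: the skew rate is taken as mean absolute deviation over mean, the "#" separator, `hot_factor`, `salts`, `max_gap`, and `max_duration` are hypothetical, and the session thresholds are fixed rather than sliding. A real deployment would run the salting step inside Spark rather than on Python lists.

```python
import random
import zlib
from collections import Counter
from statistics import mean

SEP = "#"  # hypothetical separator; assumes original keys never contain it


def skew_rate(counts):
    """Assumed skew measure: mean absolute deviation of key counts / mean."""
    values = list(counts.values())
    avg = mean(values)
    return mean(abs(v - avg) for v in values) / avg


def salt_hot_keys(keys, hot_factor=2.0, salts=10, seed=0):
    """Append a random suffix to each occurrence of an over-represented key."""
    counts = Counter(keys)
    avg = mean(counts.values())
    # A key is "hot" when its count exceeds hot_factor times the average.
    hot = {k for k, c in counts.items() if c > hot_factor * avg}
    rng = random.Random(seed)
    return [f"{k}{SEP}{rng.randrange(salts)}" if k in hot else k for k in keys]


def partition_loads(keys, partitions):
    """Count records per partition under a stable hash partitioner."""
    loads = [0] * partitions
    for k in keys:
        loads[zlib.crc32(k.encode()) % partitions] += 1
    return loads


def merge_salted(counts):
    """Second aggregation stage: strip suffixes and combine partial counts."""
    merged = Counter()
    for k, c in counts.items():
        merged[k.split(SEP)[0]] += c
    return merged


def split_sessions(timestamps, max_gap=600, max_duration=1800):
    """Split one user's sorted request times (in seconds) into sessions.

    A new session starts when the gap since the previous request exceeds
    max_gap, or the time since the session began exceeds max_duration.
    """
    sessions = []
    for t in timestamps:
        current = sessions[-1] if sessions else None
        if (current is None
                or t - current[-1] > max_gap
                or t - current[0] > max_duration):
            sessions.append([t])
        else:
            current.append(t)
    return sessions


if __name__ == "__main__":
    keys = ["a"] * 900 + ["b"] * 50 + ["c"] * 50
    salted = salt_hot_keys(keys)
    print("loads before:", partition_loads(keys, 4))
    print("loads after: ", partition_loads(salted, 4))
    print("merged:", dict(merge_salted(Counter(salted))))
    print("sessions:", split_sessions([0, 300, 700, 2000, 2100]))
```

Salting trades one extra aggregation pass (`merge_salted`) for a smaller largest key group, which is why it only pays off when a few keys dominate the data.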
Keywords/Search Tags: Log Analysis, ELK, Spark, Performance Improvement, Log Preprocessing