
Design And Implementation Of Network Flow Data Analysis System Based On ELK

Posted on: 2020-11-16  Degree: Master  Type: Thesis
Country: China  Candidate: Y J Zhang
GTID: 2428330602951860  Subject: Engineering
Abstract/Summary:
In the era of big data, network security has received unprecedented attention, yet the problem remains serious. On the one hand, cyber attacks are becoming more diverse and the volume of security data is exploding, so traditional network security analysis methods cannot meet the needs of massive data analysis. On the other hand, new attack patterns keep emerging. A big-data-based network anomaly detection platform is a good solution to these problems. Its core concept is to combine multiple big data technologies to process, analyze, correlate, classify and retrieve massive security data in real time, and to provide a series of big data security analysis features such as visual security analysis, multi-source event correlation analysis and user behavior analysis. The problem the platform tries to solve is high-performance, low-latency, high-accuracy detection of anomalous network behavior through big data ingestion, stream processing, data mining and other technologies.

This thesis analyzes big data retrieval technology, designs a network anomaly behavior detection algorithm based on Co-Forest, and implements a network flow data analysis system based on Elasticsearch using the ELK architecture. In the system, network data is captured, its features are extracted, and it is processed as a stream. It is then classified by the trained network abnormal behavior detection model, and the detection result is displayed to the user. Once an anomaly is detected, an alert is sent so that security personnel can handle it, and the system supports backtracking forensics and correlation analysis.

Based on the ELK architecture, the system is divided into four layers: Web front end and visualization, WebService layer, business logic layer, and data storage. Together they implement retrieval, association analysis and IP distribution. The Web front-end and visualization layer obtains the user's request to view the IP distribution through the client and passes it to the logic layer. The logic layer calls the corresponding interface to send the requested fields and data type from the Web layer's request to the data storage platform. The data storage platform retrieves the information as required, obtains the IP information, the number of IPs in each region, and the location of the region where each IP is located, and returns the result to the logic layer. The WebService layer accepts the results from the logic layer and returns them to the client.

A network anomaly behavior detection algorithm based on Co-Forest is designed. The classifier is first trained on the labelled data set and then assigns a confidence to each unlabelled sample. The unlabelled data set is divided by confidence using a high threshold and a low threshold. The high-confidence set and the low-confidence set are added to the labelled data set, and confidence is then re-evaluated for the medium-confidence set until all samples have been assigned or the classifier is stable.

Data preprocessing is performed on eight types of data: DNS data, HTTP data, mail data, IP quintuple data, HTTP account data, mail account data, FTP data, and alarm fields. The system underwent environment testing, covering both hardware and software. Functional tests were also performed: search function tests, abnormal behavior data analysis tests, and association rule analysis tests. Finally, the system was tested for performance, including storage capability testing, data analysis performance testing, and abnormal behavior detection performance testing.
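
The confidence-partition loop described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract gives no thresholds, features, or stopping rule details, so the function names, the thresholds `low=0.2` and `high=0.8`, and the toy data are all hypothetical. It also assumes that "confidence" means the share of ensemble members voting "anomalous", so that high-confidence samples are pseudo-labelled anomalous and low-confidence samples normal. To stay dependency-free, a bagged ensemble of nearest-centroid classifiers stands in for the random forest that Co-Forest builds on.

```python
import random


def train_ensemble(labelled, n_members=7, seed=0):
    """Train a bagged ensemble of nearest-centroid classifiers.

    Stand-in for a random forest (assumption, not the thesis's model).
    `labelled` is a list of (features, label) pairs with label 0 or 1.
    """
    if {label for _, label in labelled} != {0, 1}:
        raise ValueError("labelled set must contain both classes")
    rng = random.Random(seed)
    members = []
    while len(members) < n_members:
        sample = [rng.choice(labelled) for _ in labelled]  # bootstrap draw
        groups = {0: [], 1: []}
        for features, label in sample:
            groups[label].append(features)
        if not groups[0] or not groups[1]:
            continue  # this draw missed a class; redraw
        members.append({
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        })
    return members


def anomaly_confidence(members, features):
    """Fraction of ensemble members that vote 'anomalous' (label 1)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    votes = sum(
        sq_dist(features, m[1]) < sq_dist(features, m[0]) for m in members
    )
    return votes / len(members)


def self_train(labelled, unlabelled, low=0.2, high=0.8, max_rounds=10):
    """Repeatedly pseudo-label confident samples and retrain.

    Returns the grown labelled set and any samples still undecided.
    """
    labelled = list(labelled)
    pending = list(unlabelled)
    for _ in range(max_rounds):
        if not pending:
            break  # every sample has been assigned
        members = train_ensemble(labelled)
        medium = []
        for features in pending:
            confidence = anomaly_confidence(members, features)
            if confidence >= high:
                labelled.append((features, 1))  # confidently anomalous
            elif confidence <= low:
                labelled.append((features, 0))  # confidently normal
            else:
                medium.append(features)  # re-evaluate next round
        if len(medium) == len(pending):
            break  # nothing moved: treat the classifier as stable
        pending = medium
    return labelled, pending


# Toy data (hypothetical): two well-separated clusters of 2-D features.
seed_labelled = [((0.0, 0.1), 0), ((0.2, 0.0), 0),
                 ((5.0, 5.1), 1), ((4.8, 5.2), 1)]
unlabelled = [(0.1, 0.2), (5.1, 4.9), (0.3, 0.1), (4.9, 5.0)]

final_labelled, still_pending = self_train(seed_labelled, unlabelled)
```

On this toy data the clusters are far enough apart that every ensemble member agrees, so all four unlabelled samples are pseudo-labelled in the first round and `still_pending` ends up empty. With overlapping classes, the medium-confidence set would shrink over several rounds or the loop would stop once no sample crosses either threshold.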
Keywords: ELK, Random Forest, Co-Forest, Machine learning