
Design And Research Of Network Security Data Visualization System

Posted on: 2018-01-10 | Degree: Master | Type: Thesis
Country: China | Candidate: Z J Niu | Full Text: PDF
GTID: 2348330518967149 | Subject: Software engineering
Abstract/Summary:
The exponential growth of digital information has brought data analysis into a period of rapid development. People have long used data analysis to extract relevant information from a continuous stream of data. In network security, applying data analysis to security problems has become a new approach. The volume of data collected in security logs is huge, and it cannot be processed or used without analytical tools. In particular, analysts still need to understand network communication patterns quickly, identify anomalies in the network, and discover network attacks.
Network security visualization is a practical technology for these tasks. It turns large volumes of network data into images that are easy to understand, uses human vision to reveal the patterns and structure in the data, and so builds a bridge between security data and human cognition. The spread of visualization in network security is inevitable: the more data people need to filter, the more they want to translate it into images, with images and text displayed side by side. Visualization has become an important analytical tool. It shows the security patterns and rules behind the data, which helps people assess the current state of network security, handle incidents that have already been detected, and predict potential incidents before they occur.
Visual analysis tools also help people understand security data better, cope with data overload, save time in conveying information, and take part in data collection and analysis.
Based on the network security visualization reference model, this thesis studies and designs Nets.vis, a web prototype system for network security data visualization built on a layered architecture. The system covers the whole process from data processing to view generation. The Nets.vis framework is layered, flexible, and lightweight. It uses a client-server structure: the client renders in the browser, and the server provides data access, storage, and analysis and loads the visual components. The Nets.vis prototype system consists of the following 7 layers:
(1) Data preprocessing layer. Cleans the source data by removing dirty, garbage, and erroneous records to produce clean, usable data.
(2) Data import layer. Imports data from the MySQL database into HDFS.
(3) Data storage layer. All experimental data of the Nets.vis prototype system are stored in HDFS.
(4) Data management layer. The data warehouse of the whole Nets.vis prototype system is managed by Hive; that is, the data storage layer outputs all data to the data management layer in the form of Hive tables.
(5) Data service layer. Performs the required analysis and data mining on the data in the data warehouse.
(6) Data application layer. Results from the data service layer must be exported back to a relational database, because the high query latency of Hive makes it unsuitable for generating the final visualization results.
(7) Visualization layer.
The user views the final visualization results through the browser.
The requirements of the whole Nets.vis system can be summarized as data preprocessing, data import, data analysis, and view generation. The research work covers the following aspects.
First, a Hadoop system is deployed on a Linux server to store and manage large-scale data. The Hive data warehouse, which runs on Hadoop, stores the data, and Sqoop transfers data between the relational database MySQL and Hadoop. The server-side data import, storage, and analysis modules are all built on the Hadoop platform: Sqoop imports data from MySQL into the Hive data warehouse and then exports the analysis results back to MySQL. The client side uses Spring MVC to structure the Web application and Bootstrap to improve the visual interface of the prototype system.
Second, because the Nets.vis prototype system frequently runs queries and similar operations, the efficiency of the Hive data analysis module is important. This thesis uses sublinear-space algorithms to speed up data extraction, transformation, loading, and querying. Among them, the Misra-Gries algorithm finds the most frequent elements in a data stream, for example the IP addresses that occur most often in the network. A distinct-element counting algorithm computes the number of different elements in a data stream, for example to count the IP addresses that access a page.
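
The Misra-Gries step described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function name `misra_gries`, the parameter `k`, and the sample IP addresses are assumptions made for illustration. With at most `k - 1` counters, the algorithm is guaranteed to retain every element that occurs more than `n / k` times in a stream of length `n`, and each retained count underestimates the true count by at most `n / k`.

```python
def misra_gries(stream, k):
    """Return candidate frequent items using at most k - 1 counters.

    Illustrative sketch; `stream` is any iterable of hashable items
    (for example source IP strings) and `k` is a hypothetical parameter.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: increase its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter is available: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: decrement every counter, drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical sample stream of source IPs.
sample_stream = [
    "10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1",
    "10.0.0.4", "10.0.0.1", "10.0.0.1", "10.0.0.1",
]
candidates = misra_gries(sample_stream, k=3)
print(candidates)
```

The returned counts are lower bounds rather than exact frequencies, so a second pass over the data is needed if exact counts for the candidates are required.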
The data analysis module also combines Canopy clustering with K-means clustering to analyze source IP addresses. When selecting attribute dimensions in the data analysis module, this thesis uses the correlation coefficient and correlation matrix from probability and statistics, specifically the commonly used Pearson product-moment correlation, to check the correlation between dimensions.
Then, the visualization module of the Nets.vis prototype system is designed so that the data set can be filtered according to the user's needs. The module mainly uses two visualization tools, ECharts and D3, to build visual components for network security data attributes: maps, treemaps, parallel coordinates, relational graphs, bar charts, line charts, and rectangular heat maps. This thesis designs and implements an SVG-based rendering method for visual components, which makes the visual results richer and more intuitive. The BIRCH algorithm is used to improve the layout of the bubble chart.
Finally, following an overview-first, details-later approach, this thesis selects some of the visual components in the Nets.vis prototype system and verifies the feasibility of the system with the TCP flow log data provided by the ChinaVis 2015 challenge. The first step uses hierarchical clustering, improved bubble charts, bar charts, and relational graphs to identify the servers and clients in the network and mine the network topology. The second step classifies the servers according to protocol characteristics and time-series characteristics. The third step mines network traffic characteristics. Considering both the hierarchical and the temporal properties of network traffic data, a line chart shows the overall temporal characteristics and reveals the network's "holiday mode" and "working mode".
The fourth step uses treemaps to visualize local temporal features and discover the specific hosts that generate anomalies. The experiment shows that visual analysis of the TCP flow data with Nets.vis covers both the local and the whole-network level: the system can identify the servers and clients in the network, classify the servers, recognize network traffic patterns, and find network anomalies, which supports analysts in network management and network security situation awareness.
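
The Canopy pre-clustering that the abstract pairs with K-means can be sketched as follows. This is a hedged illustration, not the thesis code: the function name `canopy_clusters`, the thresholds `t1` and `t2`, and the two-dimensional sample points (standing in for per-source-IP features) are all assumptions. Canopy descriptions usually pick each center at random; this sketch takes the first remaining point so that the result is deterministic.

```python
import math


def canopy_clusters(points, t1, t2):
    """Group points into canopies using a loose (t1) and tight (t2) threshold.

    Illustrative sketch; requires t1 > t2 > 0. Returns a list of
    (center, members) pairs. Points are tuples of floats.
    """
    if not t1 > t2 > 0:
        raise ValueError("thresholds must satisfy t1 > t2 > 0")
    remaining = list(points)
    canopies = []
    while remaining:
        # Assumption: take the first remaining point as the next center.
        center = remaining.pop(0)
        members = [center]
        still_remaining = []
        for point in remaining:
            distance = math.dist(center, point)
            if distance < t1:
                # Within the loose threshold: the point joins this canopy.
                members.append(point)
            if distance >= t2:
                # Outside the tight threshold: it may still seed or join
                # another canopy later.
                still_remaining.append(point)
        remaining = still_remaining
        canopies.append((center, members))
    return canopies


# Hypothetical feature vectors, one per source IP.
sample_points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.1), (10.0, 0.0)]
for center, members in canopy_clusters(sample_points, t1=3.0, t2=1.0):
    print(center, members)
```

In the usual combination, the number of canopies supplies the value of K and the canopy centers serve as the initial centroids, so K-means does not need to be started from a guessed K and random seeds. The K-means iterations themselves are omitted here.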
Keywords/Search Tags: Security Visualization, Data Visualization, Network Security, Big Data Analysis