
Uncertain Data Outlier Detection Algorithm And Its Application In Network Forensics

Posted on: 2019-05-13 | Degree: Master | Type: Thesis
Country: China | Candidate: Y F Wang | Full Text: PDF
GTID: 2438330545493149 | Subject: Communication and Information System
Abstract/Summary:
With the popularity of networks, the amount of data in various fields has grown dramatically, and the importance of networks in people's daily lives has become increasingly prominent. However, various types of network security problems have appeared as networks continue to integrate into people's lives. As a result, network forensics technology has attracted wide attention and developed rapidly. Its core task is to extract and analyze the data in the network. With the continuous development of data acquisition and analysis technologies, traditional network data analysis methods can no longer handle massive data or judge network behavior accurately and efficiently. Currently, there are two major problems in network data analysis. First, the detection rate of existing anomaly detection methods cannot adapt to large-scale data. Second, the uncertainty of network data affects the accuracy of detection algorithms.

In this paper, outlier detection is used to collect, analyze and detect the data in the network, and finally to distinguish normal behavior from abnormal behavior. First, some important methods of network data processing are studied and discussed. Then a feature selection method is used to preprocess the network data in order to reduce the complexity of subsequent anomaly detection. Finally, the data is analyzed using an outlier detection algorithm based on Isolation Forest and LOF. The main work of this paper can be summarized in the following three aspects:

(1) The feature selection method was studied, and an improved feature selection algorithm based on SVM-RFE and correlation information entropy was proposed. The large volume and high dimensionality of network data make data preprocessing especially important, and feature selection is an effective preprocessing method. When calculating the importance of attributes, most existing feature selection methods consider only the importance of a single feature attribute relative to the decision result, and ignore both the relationships among the features and the relationships between the features and the categories. Therefore, in this paper the initial feature subset was first selected quickly by SVM-RFE. A forward greedy search strategy using correlation information entropy was then applied to that subset, and finally the best feature subset was chosen. Experimental results showed that the proposed method effectively improves the performance of subsequent algorithms.

(2) Outlier detection for uncertain data was studied, and an outlier detection algorithm based on Isolation Forest and LOF was proposed. To address problems of uncertain data such as the inflation of possible worlds and the additional probabilistic dimension, the initial outliers were quickly screened out by Isolation Forest, and the final outliers were then detected using an LOF value redefined for uncertain data. Compared with other detection algorithms, the algorithm in this paper detects outliers effectively, improves efficiency, and has good robustness.

(3) A network forensics system based on Isolation Forest and LOF was designed. Based on the analysis of network data, a corresponding function module was designed for each step of the process. The improved feature selection algorithm based on SVM-RFE and correlation information entropy and the outlier detection algorithm based on Isolation Forest and LOF were applied to the corresponding modules. Finally, the framework of a network forensics system based on Isolation Forest and LOF was realized. The system can effectively analyze network data and distinguish normal from abnormal network behavior.
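
The two-stage idea in item (2), a fast Isolation Forest screen followed by an LOF check on the surviving candidates, can be illustrated with a minimal sketch. This is not the thesis's algorithm: it works on ordinary (certain) points and uses the classical LOF rather than the LOF redefined for uncertain data, which the abstract does not specify. The function names, the 10% screening fraction, k = 5 and the LOF threshold of 1.5 are all assumptions chosen for illustration.

```python
import math
import random


def _c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def _build_tree(points, depth, max_depth, rng):
    """Recursively split on a random dimension at a random value (one isolation tree)."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            _build_tree(left, depth + 1, max_depth, rng),
            _build_tree(right, depth + 1, max_depth, rng))


def _path_length(tree, p, depth=0):
    """Depth at which p lands in a leaf, plus the usual correction for leaf size."""
    if tree[0] == "leaf":
        return depth + _c(tree[1])
    _, dim, split, left, right = tree
    return _path_length(left if p[dim] < split else right, p, depth + 1)


def isolation_scores(data, n_trees=100, sample_size=256, seed=0):
    """Isolation Forest anomaly score in (0, 1) per point; higher means easier to isolate."""
    rng = random.Random(seed)
    m = min(sample_size, len(data))  # assumes at least two points
    max_depth = math.ceil(math.log2(m))
    trees = [_build_tree(rng.sample(data, m), 0, max_depth, rng)
             for _ in range(n_trees)]
    return [2.0 ** (-sum(_path_length(t, p) for t in trees) / (n_trees * _c(m)))
            for p in data]


def lof_scores(data, indices, k):
    """Classical LOF of data[i] for each i in indices (ties at the k-distance are cut at k)."""
    knn_cache = {}

    def knn(i):
        # k nearest neighbours of point i as (distance, index) pairs.
        if i not in knn_cache:
            dists = sorted((math.dist(data[i], data[j]), j)
                           for j in range(len(data)) if j != i)
            knn_cache[i] = dists[:k]
        return knn_cache[i]

    def lrd(i):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(knn(j)[-1][0], d) for d, j in knn(i)]
        return len(reach) / max(sum(reach), 1e-12)

    return {i: sum(lrd(j) for _, j in knn(i)) / (len(knn(i)) * lrd(i))
            for i in indices}


def detect_outliers(data, screen_fraction=0.1, k=5, lof_threshold=1.5):
    """Stage 1: keep the top-scoring fraction from Isolation Forest.
    Stage 2: confirm candidates whose LOF exceeds the (assumed) threshold."""
    scores = isolation_scores(data)
    n_candidates = max(1, int(screen_fraction * len(data)))
    candidates = sorted(range(len(data)), key=lambda i: scores[i],
                        reverse=True)[:n_candidates]
    lof = lof_scores(data, candidates, k)
    return sorted(i for i in candidates if lof[i] > lof_threshold)
```

Calling `detect_outliers(points)` on a list of equal-length numeric tuples returns the indices of the points that pass both stages. Restricting the LOF computation to the screened candidates is what keeps the expensive neighbourhood search off the bulk of the data, which is the efficiency argument the abstract makes for combining the two methods.
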
Keywords/Search Tags: Network Forensics, Uncertain Data, Outlier Detection, Feature Selection